Abhijit Dasgupta 


Set Theory


With an Introduction 
to Real Point Sets 


Birkhäuser


Abhijit Dasgupta 
Department of Mathematics 
University of Detroit Mercy 
Detroit, MI, USA 

dasgupab@udmercy.edu


ISBN 978-1-4614-8853-8 ISBN 978-1-4614-8854-5 (eBook) 
DOI 10.1007/978-1-4614-8854-5 
Springer New York Heidelberg Dordrecht London 


Library of Congress Control Number: 2013949160 


Mathematics Subject Classification (2000): 03E10, 03E20, 03E04, 03E25, 03E15, 28A05, 54H05,
03E30, 03E75, 03E50, 03E55, 03E60, 03E02 


© Springer Science+Business Media New York 2014 

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of 
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, 
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information 
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology 
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection 
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered 
and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of 
this publication or parts thereof is permitted only under the provisions of the Copyright Law of the 
Publisher’s location, in its current version, and permission for use must always be obtained from Springer. 
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations 
are liable to prosecution under the respective Copyright Law. 

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication 
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant 
protective laws and regulations and therefore free for general use. 

While the advice and information in this book are believed to be true and accurate at the date of 
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for 
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with 
respect to the material contained herein. 


Printed on acid-free paper 


Springer is part of Springer Science+Business Media (www.birkhauser-science.com) 


To my mother 
and 
to the memory of my father 


Preface 


Most modern set theory texts, even at the undergraduate level, introduce specific 
formal axiom systems such as ZFC relatively early, perhaps because of the 
(understandably real) fear of paradoxes. At the same time, most mathematicians 
and students of mathematics seem to care little about special formal systems, yet 
may still be interested in the part of set theory belonging to “mathematics proper,” 
i.e., cardinals, order, ordinals, and the theory of the real continuum. There appears 
to be a gulf between texts of mainstream mathematics and those of set theory and 
logic. 

This undergraduate set theory textbook regards the core material on cardinals, 
ordinals, and the continuum as a subject area of classical mathematics interesting 
in its own right. It separates and postpones all foundational issues (such as 
paradoxes and special axioms) into an optional part at the end. The main material 
is thus developed informally—not within any particular axiom system—to avoid 
getting bogged down in the details of formal development and its associated 
metamathematical baggage. I hope this will make this text suitable for a wide range 
of students interested in any field of mathematics and not just for those specializing 
in logic or foundations. At the same time, students with metamathematical interests 
will find an introduction to axiomatic ZF set theory in the last part, and some 
glimpses into key foundational topics in the postscript chapters at the end of each 
part. 

Another feature of this book is that its coverage of the real continuum is confined 
exclusively to the real line R. All abstract or general concepts such as topological 
spaces, metric spaces, and even the Euclidean spaces of dimension 2 or higher are 
completely avoided. This may seem like a severe handicap, but even this highly 
restricted framework allows the introduction of many interesting topics in the theory 
of real point sets. In fact, not much substance in the theory is lost and a few deeper 
intuitions are gained. As evidenced by the teaching of undergraduate real analysis, 
the student who is first firmly grounded in the hard and concrete details of R will 
better enjoy and handle the abstraction found in later, more advanced studies. 

The book grew out of an undergraduate course in introductory set theory that 
I taught at the University of Detroit Mercy. The prerequisite for the core material 
of the book is a post-calculus undergraduate US course in discrete mathematics or
linear algebra, although precalculus and some exposure to proofs should technically 
suffice for Parts I and II. 

The book starts with a “prerequisites” chapter on sets, relations, and functions, 
including equivalence relations and partitions, and the definition of linear order. The 
rest is divided into four relatively independent parts with quite distinct mathematical 
flavors. Certain basic techniques are emphasized across multiple parts, such as 
Cantor’s back-and-forth method, construction of perfect sets, Cantor–Bendixson
analysis, and ordinal ranks. 

Part I is a problem-based short course which, starting from Peano arithmetic, 
constructs the real numbers as Dedekind cuts of rationals in a routine way with 
two possible uses. A student of mathematics not going into formal ZF set theory 
will work out, once and for all, a detailed existence proof for a complete ordered 
field. And for a student who might later get into axiomatic ZF set theory, the 
redevelopment of Peano arithmetic and the theory of real numbers formally within 
ZF will become largely superfluous. One may also decide to skip Part I altogether 
and go directly to Part II. 

Part II contains the core material of the book: The Cantor–Dedekind theory of
the transfinite, especially order, the continuum, cardinals, ordinals, and the Axiom of 
Choice. The development is informal and naive (non-axiomatic), but mathematically 
rigorous. While the core material is intended to be interesting in its own right, it 
also forms the folklore set-theoretic prerequisite needed for graduate level topology, 
analysis, algebra, and logic. Useful forms of the Axiom of Choice, such as Zorn’s 
Lemma, are covered. 

Part III of the book is about point sets of real numbers. It shows how the
theory of sets and orders connects intimately to the continuum and its topology.
In addition to the basic theory of R including measure and category, it presents
more advanced topics such as Brouwer’s theorem, Cantor–Bendixson analysis,
Sierpinski’s theorem, and an introduction to Borel and analytic sets, all in the
context of the real line. Thus the reader gets access to significant higher results in a 
concrete manner via powerful techniques such as Cantor’s back-and-forth method. 
As mentioned earlier, all development is limited to the reals, but the apparent loss 
of generality is mostly illusory and the special case for real numbers captures much 
of the essential ideas and the central intuitions behind these theorems. 

Parts II and III of the book focus on gaining intuition rather than on formal
development. I have tried to start with specific and concrete cases of examples 
and theorems before proceeding to their more general and abstract versions. As 
a result, some important topics (e.g., the Cantor set) appear multiple times in the 
book, generally with increasing levels of sophistication. Thus, I have sacrificed 
compactness and conciseness in favor of intuition building and maintaining some 
independence between the four parts. 

Part IV deals with foundational issues. The paradoxes are first introduced here, 
leading to formal set theory and the Zermelo–Fraenkel axiom system. Von Neumann
ordinals are also first presented in this part. 

Each part ends with a postscript chapter discussing topics beyond the scope of the
main text, ranging from philosophical remarks to glimpses into modern landmark 
results of set theory such as the resolution of Lusin’s problems on projective sets 
using determinacy of infinite games and large cardinals. 

Problems form an integral and essential part of the book. While some of them 
are routine, they are generally meant to form an extension of the text. A harder 
problem will contain hints and sometimes an outline for a solution. Starred sections 
and problems may be regarded as optional. 

The book has enough material for a one-year course for advanced undergraduates.
The relative independence of the four parts allows various possibilities for
covering topics. In a typical one-semester course, I usually briefly cover Part I, spend
most of the time in Part II, and finish with a brief overview of Part IV. For students
with more foundational interests, more time can be spent on the material of Part IV 
and the postscripts. On the other hand, for less foundationally inclined but more 
mathematically advanced students with prior exposure to advanced calculus or real 
analysis, only Parts II and III may be covered with Parts I and IV skipped altogether. 


Chapter 1
Preliminaries: Sets, Relations, and Functions 


Abstract This preliminary chapter informally reviews the prerequisite material for 
the rest of the book. Here we set up our notational conventions, introduce basic set- 
theoretic notions including the power set, ordered pairs, Cartesian product, relations, 
functions, and their properties, sequences, strings and words, indexed and unindexed 
families, partitions and equivalence relations, and the basic definition of linear 
order. Much of the material of this chapter can be found in introductory discrete 
mathematics texts. 


1.1 Introduction 


Note. In this preliminary chapter, we informally use the familiar number systems N,
Z, R, and their properties to provide illustrative examples for sets, relations, and
functions. In the next three chapters all of these notions will be formally defined.
Thus all our assumptions about these number systems are temporary and will be
dropped at the end of this chapter.


We assume basic familiarity with sets and functions, e.g., as found in elementary 
calculus. Some examples of sets are the real intervals: The open interval (a, b)
consists of real numbers lying strictly between a and b, and the closed interval
[a, b] consists of real numbers x satisfying a ≤ x ≤ b. The interval (−∞, ∞) is the
entire real line and is denoted by the special symbol R:


R = (−∞, ∞).

In addition we will be using the special symbols N and Z, where 


• N consists of the natural numbers starting from 1 (positive integers).¹
• Z consists of all integers: positive, negative, or zero.


¹Usage varies for the interpretation of the term “natural number” and the symbol N. Many texts
include 0 as a natural number, but we will not follow that convention.
The Principle of Induction 


We will also assume some familiarity with the principle of induction for the positive
integers N. Let P be a property of natural numbers. We will use the notation “P(n)”
to stand for the assertion “n has the property P.” For example, P(n) may stand for
“n(n² + 2) is divisible by 3.”


The Principle of Induction. Let P be a property of natural numbers such that 


• P(1) is true.
• For any natural number n, if P(n) is true then P(n + 1) is true.


Then P(n) is true for all natural numbers n. 
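
As a small illustration of the sample property P(n), "n(n² + 2) is divisible by 3," the following Python sketch checks the base case and the first thousand instances, and also checks the algebraic identity behind the inductive step: the difference between consecutive values equals 3(n² + n + 1). The function names and the cut-off 1000 are our own choices for illustration. A finite check of this kind is only a sanity test and does not replace the inductive proof.

```python
def P(n: int) -> bool:
    """The sample property from the text: n(n^2 + 2) is divisible by 3."""
    return (n * (n**2 + 2)) % 3 == 0


def step_difference(n: int) -> int:
    """(n+1)((n+1)^2 + 2) - n(n^2 + 2), which expands to 3n^2 + 3n + 3."""
    return (n + 1) * ((n + 1) ** 2 + 2) - n * (n**2 + 2)


# Base case P(1): 1 * (1 + 2) = 3 is divisible by 3.
base_case = P(1)

# Finite sanity check for n = 1, ..., 1000 (not a proof).
checked = all(P(n) for n in range(1, 1001))

# The difference is a multiple of 3, so P(n) implies P(n + 1).
step_ok = all(step_difference(n) == 3 * (n * n + n + 1) for n in range(1, 1001))
```

If n(n² + 2) is divisible by 3 and the difference to the next value is a multiple of 3, then the next value is divisible by 3 as well, which is exactly the inductive step.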


Problem 1. Show that the principle of induction is equivalent to the principle of 
strong induction for N which is as follows: 


Let P be a property of natural numbers such that 


• For any natural number n, if P(m) is true for all natural numbers m < n
  then P(n) is true.


Then P(n) is true for all natural numbers n. 


The natural numbers and the principle of induction will be studied in detail in 
Chap. 2. 


1.2 Membership, Subsets, and Naive Axioms 


Naively speaking, a set A is a collection or group of objects such that membership
in A is definitely determined in the sense that given any x, exactly one of “x ∈ A”
or “x ∉ A” is true, where the notation


x ∈ A


is used to denote that x is a member of the set A, and the notation


x ∉ A

stands for x is not a member of A. For example, we have 3 ∈ (2, ∞), 0.5 ∉ Z, 1 ∈ N,
0 ∉ N, etc.

We say that A is a subset of B, denoted by A ⊆ B, if every member of A is a
member of B. We write A ⊈ B to denote that A is not a subset of B. A ⊆ B is also
expressed by saying that A is contained in B or B contains A. Thus we have


A ⊆ B ⇔ for all x, if x ∈ A then x ∈ B, and


A ⊈ B ⇔ there is some x ∈ A such that x ∉ B.


We are using the symbol “⇔” as a short-hand for the phrase “if and only if” (or
equivalence of statements). Similarly, the symbol “⇒” will stand for implication,
that is “P ⇒ Q” means “if P then Q” or “P implies Q.”

We will also often use the abbreviations “∀x(...)” for “for all x, ...” (the
universal quantifier) and “∃x(...)” for “there is some x such that ...” (the
existential quantifier). With such abbreviations, the lines displayed above can be
shortened to:


A ⊆ B ⇔ ∀x(x ∈ A ⇒ x ∈ B), and
A ⊈ B ⇔ ∃x(x ∈ A and x ∉ B).


The principle of extensionality says that two sets having the same members must
be identical, that is:


A = B ⇔ ∀x(x ∈ A ⇔ x ∈ B),


which can also be stated in terms of subsets as:


A = B ⇔ A ⊆ B and B ⊆ A (Extensionality).


The naive principle of comprehension is used to form new sets. Given any
property P, we write P(x) for the assertion “x has property P.” Then the naive
principle of comprehension says that, given any property P, there is a set A
consisting precisely of those x for which P(x) is true. In symbols:


∃A ∀x (x ∈ A ⇔ P(x)) (Comprehension).


We use the qualifier “naive” to indicate that the principle of comprehension uses the
vague notion of “property,” and unrestricted use of the comprehension principle can
cause problems that will be discussed later.² For now, we follow the naive approach
of Cantor’s classical set theory, and the axioms of extensionality and comprehension
(together with a couple more axioms such as the Axiom of Choice to be introduced
in Chap. 5) will form the basis of development for our central topics of study.³


²Such difficulties lead to consideration of metamathematical issues. We will be confining ourselves
to purely mathematical aspects of set theory in the first three parts of the book.

³This is a satisfactory approach for most areas of mathematics (and for most mathematicians) since
the natural ways of forming new sets out of old ones such as taking subsets, forming the power set,
and taking unions, do not seem to lead to difficulties.


Set Builder Notation


By extensionality, the set A whose existence is given by comprehension from a 
property P is unique, so we can introduce the set builder notation 


{x | P(x)} or {x : P(x)}


to denote the unique set A consisting precisely of those x for which P(x) is true,
i.e., the set A defined by the condition: x ∈ A ⇔ P(x) (for all x). So,


A = {x | P(x)} if and only if: for all x, x ∈ A ⇔ P(x).


For example, we have:


[a, ∞) = {x | x ∈ R and a ≤ x}.


In this example the resulting set [a, ∞) is a subset of R. In general, when a new
set B is defined as a subset of an old set A as those members of A which have the
property P, that is when


B = {x | x ∈ A and P(x)},


we will often use the alternative notation


B = {x ∈ A | P(x)}.
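
The restricted form B = {x ∈ A | P(x)} has a close analogue in Python's set comprehensions. The sketch below is only an illustration under stated assumptions: the finite set `A` and the property `P` ("x is even") are our own sample choices, since an infinite set such as R cannot be enumerated. The last line illustrates extensionality: a set is determined by its members, not by the order or repetition of the listing.

```python
# A finite stand-in for the "old" set A (an assumption made for illustration).
A = set(range(-5, 6))


def P(x):
    """A sample property: x is even."""
    return x % 2 == 0


# B = {x ∈ A | P(x)}
B = {x for x in A if P(x)}

# Extensionality: listing order and repetition do not matter.
same = {1, 2, 3} == {3, 1, 2, 2}
```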


The Empty Set and Singletons 


Perhaps the simplest set is the empty set ∅ which has no members. (Its existence can
be proved using the naive comprehension principle by taking P(x) to be “x ≠ x.”)
The empty set is a subset of every set:


∅ ⊆ A for all sets A.


For any a, the singleton set {a} is the set whose only member is a: 
{a} := {x|x =a}. 


We will often use the notation “:=” when definitions are introduced. 

The singleton set {a} should be distinguished from the element a. For example
{∅} and ∅ cannot be the same since the first one has a member while the second has
no members, and a set which has a member cannot be identical with a set with no
members (by extensionality).


Problem 2 (Royden). Prove that if x ∈ ∅, then x is a green-eyed lion.


Problem 3. Prove that {a} = {b} if and only if a = b.


Problem 4. Prove that the sets {{{{∅}}}} and {{∅}} are distinct. For each of these
two sets, determine if it is a singleton.


The Brace-List Notation 


More generally, we can denote sets consisting of multiple members using the brace- 
list notation, as in: 


{a, b} := {x | x = a or x = b},


{a, b, c} := {x | x = a or x = b or x = c}, etc.


The set {a, b} is sometimes called an unordered pair. 


Problem 5. Prove that {a,b} = {b,a} = {a,a,b,a}. Can {a, b} be a singleton? 


More informally, we often use the brace-list notation together with dots “...”
(ellipsis) where not all the elements are listed, but the missing elements can readily
be understood from the notation. For example,


{1, 2, ..., 100} stands for {x ∈ N | x ≤ 100}.


This is used for infinite sets as well, as in


N = {1, 2, 3, ..., n, ...}.


If A is a set and a(x) is an expression involving x which is uniquely determined for
each x ∈ A, then we use the notation


{a(x) | x ∈ A}


as a convenient abbreviation for the set {y | y = a(x) for some x ∈ A}. For
example,


{2n − 1 | n ∈ N}


denotes the set of all odd positive integers. This notation is also extended for
expressions with multiple variables. For example,


{a(u, v) | u ∈ A, v ∈ B}


stands for {x | x = a(u, v) for some u ∈ A and v ∈ B}. Thus


{m² + n² | m, n ∈ N}


denotes the set of integers which can be expressed as a sum of two perfect squares.
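
The notation {a(x) | x ∈ A} also corresponds directly to a Python set comprehension. The sketch below uses a finite cut-off (`N10`, the integers 1 through 10) as a stand-in for N; that cut-off and the variable names are our own assumptions for illustration only.

```python
N10 = range(1, 11)  # the positive integers 1..10, a finite cut-off (assumption)

# {2n - 1 | n ∈ N}, restricted to n ≤ 10
odds = {2 * n - 1 for n in N10}

# {m² + n² | m, n ∈ N}, restricted to m, n ≤ 10
two_squares = {m * m + n * n for m in N10 for n in N10}
```

Since N here starts from 1, the number 1 itself is not of the form m² + n², while 2 = 1² + 1² and 25 = 3² + 4² are.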


1.3 The Power Set and Set Operations


The Power Set


The power set P(A) of a set A is defined to be the set of all subsets of A:


P(A) := {x | x ⊆ A}.
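
For a finite set, the power set can be computed by listing the subsets of each size. The following is a minimal sketch, assuming sets are represented as Python sets and subsets as `frozenset` values (needed because ordinary Python sets cannot be members of other sets); the function name `power_set` is our own.

```python
from itertools import combinations


def power_set(A):
    """Return the set of all subsets of the finite set A, each as a frozenset."""
    items = list(A)
    return {
        frozenset(c)
        for r in range(len(items) + 1)      # subset sizes 0, 1, ..., |A|
        for c in combinations(items, r)     # all subsets of size r
    }


subsets_of_pair = power_set({"a", "b"})
```

Here `subsets_of_pair` consists of the four sets ∅, {a}, {b}, and {a, b}.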


Problem 6. What is P(∅)? Find P(P(∅)) and P(P(P(∅))). If a, b, c are distinct
elements, find P({a, b, c}).


Problem 7. Show that if a set A has n elements then P(A) has 2ⁿ elements.


Problem 8. A set A is called transitive if every element of A is also a subset of A.


1. Among ∅, {∅}, and {{∅}}, which ones are transitive?
2. Find a transitive set with five elements.
3. Prove that a set is transitive if and only if its power set is transitive.
4. Can you find an example of an infinite transitive set?


Set Operations 


Given sets A and B we define their 


Union: A ∪ B := {x | x ∈ A or x ∈ B}

Intersection: A ∩ B := {x | x ∈ A and x ∈ B}

Difference: A ∖ B := {x | x ∈ A and x ∉ B}.


We will assume basic familiarity with these operations which are often illustrated
by means of Venn diagrams.


Problem 9. Show that (A ∖ B) ∪ (B ∖ A) = (A ∪ B) ∖ (A ∩ B).


The set (A ∖ B) ∪ (B ∖ A) is called the symmetric difference of A and B, and denoted
by A △ B.
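
Python's built-in `set` type provides these operations as the operators `|` (union), `&` (intersection), `-` (difference), and `^` (symmetric difference). The sample sets below are our own choices; checking one example is of course only an illustration of the identity in Problem 9, not a proof.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

union = A | B            # A ∪ B
intersection = A & B     # A ∩ B
difference = A - B       # A ∖ B

# Symmetric difference, computed from the definition (A ∖ B) ∪ (B ∖ A).
sym_diff = (A - B) | (B - A)
```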


Universal Sets and Complementation: In many situations there will be a fixed set U
such that all sets under consideration will be subsets of U. The set U then becomes
the largest set among those being considered, and is sometimes called a universal
set, with the power set P(U) being the universe. For example, in the context of the
real line, all sets being discussed may be subsets of R. In such cases, where the fixed
largest set U does not change and can be understood from context, it is convenient to
write U ∖ A simply as A′, called the complement of A, giving rise to the set operation
of complementation relative to U.


The Algebra of Subsets of a Fixed Set 


In the last situation described, where all sets being considered are subsets of 
a universal set U, the collection P(U), together with the operations of union, 
intersection, and complementation, is known as the Boolean Algebra of Subsets of 
U, or simply the Algebra of Subsets of U. For the algebra of subsets of U, the set 
operations satisfy many properties, most of which are readily derived. We list them 
in the following problems. 


Problem 10. Show that, for all sets A, B, C:


1. A ∪ A = A = A ∩ A.
2. A ∪ B = B ∪ A and A ∩ B = B ∩ A.
3. A ∪ (B ∪ C) = (A ∪ B) ∪ C and A ∩ (B ∩ C) = (A ∩ B) ∩ C.
4. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) and A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
5. A ∖ (B ∪ C) = (A ∖ B) ∩ (A ∖ C) and A ∖ (B ∩ C) = (A ∖ B) ∪ (A ∖ C).
6. A ∩ B ⊆ A ⊆ A ∪ B.
7. A ⊆ B ⇔ A ∪ B = B ⇔ A ∩ B = A ⇔ A ∖ B = ∅.


Problem 11. Suppose that U is a set such that all sets A, B, C, ... under
consideration are subsets of U, and write A′ for U ∖ A. Show that


1. A ∪ ∅ = A = A ∩ U.
2. ∅′ = U, U′ = ∅, and (A′)′ = A.
3. A = B′ ⇔ B = A′ ⇔ A ∩ B = ∅ and A ∪ B = U.
4. A ∩ A′ = ∅ and A ∪ A′ = U.
5. A ∩ ∅ = ∅ and A ∪ U = U.
6. A ∩ B = ∅ ⇔ A′ ∪ B′ = U ⇔ A ⊆ B′ ⇔ B ⊆ A′.
7. (A ∪ B)′ = A′ ∩ B′ and (A ∩ B)′ = A′ ∪ B′.


The last equalities in the list are called the DeMorgan laws.
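
For a small finite universal set, the DeMorgan laws can be checked exhaustively over all pairs of subsets. The sketch below is an illustration only, assuming a four-element universe `U` of our own choosing; the helper names `complement` and `subsets` are hypothetical, and an exhaustive check over one universe is not a substitute for the general proof.

```python
from itertools import combinations

U = frozenset({1, 2, 3, 4})  # a small universal set (assumption)


def complement(A):
    """A′ := U ∖ A, the complement relative to the fixed universe U."""
    return U - A


def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


# Check (A ∪ B)′ = A′ ∩ B′ and (A ∩ B)′ = A′ ∪ B′ for all A, B ⊆ U.
de_morgan_holds = all(
    complement(A | B) == complement(A) & complement(B)
    and complement(A & B) == complement(A) | complement(B)
    for A in subsets(U)
    for B in subsets(U)
)
```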


Problem 12. Show that:

1. A ∖ B = A ∖ (A ∩ B) and A ∖ (A ∖ B) = A ∩ B.
2. A ∪ B = ∅ ⇔ A = B = ∅, and A = B ⇔ A △ B = ∅.


1.4 Ordered Pairs and Relations


We will let (a,b) denote the ordered pair consisting of a and b in the order 
of appearance. Its exact definition will not matter to us; any definition will be 
satisfactory so long as (a,b) is uniquely defined for all a,b, and satisfies the 
following property: 


(a, b) = (c, d) ⇒ a = c and b = d (for all a, b, c, d).


We will leave the ordered pair as an undefined primitive notion satisfying this
characterizing condition.⁴

It is important to distinguish the unordered pair {a, b}, for which the “commutative
property” {a, b} = {b, a} always holds, from the ordered pair (a, b), for which
(a, b) = (b, a) will hold only if a = b.


Cartesian Product 


The Cartesian product A × B of two sets A and B is defined as the set of all ordered
pairs (a, b) with a ∈ A and b ∈ B:


A × B := {(a, b) | a ∈ A, b ∈ B}.


We abbreviate A × A as A². For example, the familiar Cartesian plane R² = R × R
is the set of all ordered pairs (a, b) where a and b are real numbers.
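
In Python, tuples behave like ordered pairs (two tuples are equal exactly when their components agree in order), and `itertools.product` enumerates a Cartesian product of finite sets. The following sketch uses sample sets of our own choosing, and `cartesian` is a hypothetical helper name.

```python
from itertools import product


def cartesian(A, B):
    """A × B as a set of Python tuples (a, b) with a ∈ A and b ∈ B."""
    return {(a, b) for a, b in product(A, B)}


A = {1, 2}
B = {"x", "y", "z"}
AxB = cartesian(A, B)
A_squared = cartesian(A, A)   # A² = A × A
```

Note that `(1, 2) == (2, 1)` is false while `{1, 2} == {2, 1}` is true, mirroring the distinction between ordered and unordered pairs.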


Relations 


A relation is defined to be any set consisting only of ordered pairs. Thus:


R is a relation ⇔ for all x ∈ R, x = (a, b) for some a, b.


We will use the notation xRy to denote (x, y) ∈ R, and ¬xRy to denote (x, y) ∉ R.
The domain and range of a relation R, denoted by dom(R) and ran(R)
respectively, are defined as:


dom(R) := {x | xRy for some y}, and ran(R) := {y | xRy for some x}.
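
A finite relation can be modelled as a Python set of tuples, and then dom and ran are one-line comprehensions. In the sketch below the sample relation ("x properly divides y" on the integers 1 through 6) is our own choice for illustration.

```python
def dom(R):
    """dom(R) := {x | xRy for some y}."""
    return {x for (x, y) in R}


def ran(R):
    """ran(R) := {y | xRy for some x}."""
    return {y for (x, y) in R}


# Sample relation: x < y and x divides y, for x, y in {1, ..., 6}.
R = {(x, y) for x in range(1, 7) for y in range(1, 7) if x < y and y % x == 0}
```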


⁴This was the standard practice until Wiener showed in 1914 that the notion of ordered pair can be
reduced to a definition in terms of sets by taking (a, b) := {{a}, {b, ∅}}. This was later improved
by Kuratowski in 1921 to the current standard definition (a, b) := {{a}, {a, b}}. The interested
reader may want to verify as an exercise that both of these are satisfactory definitions for the
ordered pair.


Problem 13. Find the domain and range for each of the following relations.


R₁ := {(x, y) ∈ R² | xy = 1},
R₂ := {(x, y) ∈ R² | x² + y² = 1},
R₃ := {(x, y) ∈ R² | x = sin y},
R₄ := {(x, y) ∈ R² | x² < y}.


If R is a relation then R ⊆ dom(R) × ran(R) and so a relation could also have been
defined using the condition in the following problem.


Problem 14. R is a relation ⇔ R ⊆ A × B for some sets A and B.


We say that R is a relation on A if R ⊆ A × A.


Problem 15. If R is a relation, then R is a relation on some set A.


If R is a relation, then its inverse relation R⁻¹ is defined as


R⁻¹ := {(u, v) | (v, u) ∈ R}.


For example, if R = {(x, y) ∈ R² | x < y} and S = {(x, y) ∈ R² | x > y}, then
R⁻¹ = S and S⁻¹ = R. It is easily verified that (R⁻¹)⁻¹ = R.

If R and S are relations, their relative product (or composition), denoted by R · S
or by RS, is defined as R · S := {(x, y) | for some u, xRu and uSy}.
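
For finite relations stored as sets of tuples, the inverse and the relative product can be sketched as follows. We assume the reading of R · S in which x (R · S) y holds when xRu and uSy for some u; the function names and the sample relations are our own.

```python
def inverse(R):
    """R⁻¹ := {(u, v) | (v, u) ∈ R}."""
    return {(v, u) for (u, v) in R}


def relative_product(R, S):
    """R · S := {(x, y) | xRu and uSy for some u} (assumed reading)."""
    return {(x, y) for (x, u) in R for (v, y) in S if u == v}


R = {(1, 2), (2, 3)}
S = {(2, 5), (3, 6)}

RS = relative_product(R, S)
```

On this sample, `RS` is {(1, 5), (2, 6)}, and inverting it gives the same relation as `relative_product(inverse(S), inverse(R))`.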


Problem 16. Let S be the relation on R defined as


S := {(x, y) ∈ R² | −1 < x − y < 1}.


What is S · S? Draw figures showing S and S · S on the Cartesian plane.


Problem 17. Show that (R · S)⁻¹ = S⁻¹ · R⁻¹.


Properties of Relations 


Let R be a relation on A. We say that 


1. R is reflexive on A if xRx, for all x ∈ A;
2. R is irreflexive on A if ¬xRx, for all x ∈ A;
3. R is symmetric on A if xRy ⇒ yRx, for all x, y ∈ A;
4. R is asymmetric on A if xRy ⇒ ¬yRx, for all x, y ∈ A;
5. R is antisymmetric on A if xRy and yRx ⇒ x = y, for all x, y ∈ A;
6. R is transitive on A if xRy and yRz ⇒ xRz, for all x, y, z ∈ A;
7. R is connected on A if x ≠ y ⇒ xRy or yRx, for all x, y ∈ A.
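
For a finite relation R on a finite set A, each of these seven properties can be tested directly from its definition. The sketch below assumes R is a set of tuples with R ⊆ A × A; the function names and the two sample relations (strict and non-strict order on {1, 2, 3}) are our own choices.

```python
def is_reflexive(R, A):
    return all((x, x) in R for x in A)


def is_irreflexive(R, A):
    return all((x, x) not in R for x in A)


def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)


def is_asymmetric(R):
    return all((y, x) not in R for (x, y) in R)


def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)


def is_transitive(R):
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)


def is_connected(R, A):
    return all((x, y) in R or (y, x) in R for x in A for y in A if x != y)


A = {1, 2, 3}
less = {(x, y) for x in A for y in A if x < y}    # strict order <
leq = {(x, y) for x in A for y in A if x <= y}    # non-strict order ≤
```

On these samples, `less` is irreflexive, asymmetric, transitive, and connected, while `leq` is reflexive, antisymmetric, transitive, and connected.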


Problem 18. Given a set A, put Δ_A := {(x, x) | x ∈ A}. (Δ_A is called the diagonal
or identity on A.) If R is a relation on A, show that:


1. R is reflexive on A ⇔ Δ_A ⊆ R;
2. R is irreflexive on A ⇔ R ∩ Δ_A = ∅;
3. R is symmetric on A ⇔ R ⊆ R⁻¹ ⇔ R = R⁻¹ ⇔ R⁻¹ ⊆ R;
4. R is asymmetric on A ⇔ R ∩ R⁻¹ = ∅;
5. R is antisymmetric on A ⇔ R ∩ R⁻¹ ⊆ Δ_A;
6. R is transitive on A ⇔ R · R ⊆ R;
7. R is connected on A ⇔ R ∪ R⁻¹ ∪ Δ_A = A × A.


Problem 19. Let S be the relation of non-equality on the set A, that is, xSy ⇔
x, y ∈ A and x ≠ y. Then S is irreflexive, symmetric, and connected on A.
Moreover, if R is any relation on A which is irreflexive, symmetric, and connected
on A, then R = S.


Problem 20. Given a set A with n elements, how many relations are there on A
which are both symmetric and connected? How many are reflexive? How many are
irreflexive? How many are neither?


Transitivity is an important property of relations. For transitive relations, the 
properties of irreflexivity and asymmetry coincide. 


Problem 21. If R is a transitive relation on A, show that R is irreflexive on A if
and only if R is asymmetric on A. 


1.5 Functions 


A relation F is said to be a function if xFy and xFz ⇒ y = z, for all x, y, z. If
F is a function and x ∈ dom(F), then there is a unique y such that xFy, and we
denote this y by F(x), the usual functional notation.
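
The defining condition can be tested mechanically for a finite relation. The following sketch assumes relations are Python sets of tuples; `is_function` and `apply` are hypothetical helper names, and the two sample relations are our own.

```python
def is_function(F):
    """F is a function if xFy and xFz imply y = z."""
    seen = {}
    for (x, y) in F:
        if x in seen and seen[x] != y:
            return False      # x is related to two different values
        seen[x] = y
    return True


def apply(F, x):
    """Return the unique y with (x, y) in F; assumes F is a function and x ∈ dom(F)."""
    (y,) = {v for (u, v) in F if u == x}
    return y


square = {(x, x * x) for x in range(-3, 4)}   # a function
circle = {(0, 1), (0, -1), (1, 0)}            # not a function: 0 has two values
```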

We also say that F is a function from A to B, written using the standard notation 


F: A → B,


to mean that F is a function with dom(F) = A and ran(F) ⊆ B. In this case, it
is common to abuse terminology and refer to the triplet (F, A, B) as “the function
F: A → B.” The set B is then sometimes referred to as the converse domain or
co-domain of the function F: A → B (more precisely, B is the co-domain of the
triplet (F, A, B)).

Functions are also called mappings. If A is a set and a(x) is an expression
involving the variable x which is uniquely determined for each x ∈ A, then the
relation


F := {(x, y) | x ∈ A and y = a(x)}


is a function with domain A. This function F is sometimes referred to as “the
mapping x ↦ a(x) where x ∈ A,” and can be written more simply as


F = {(x, a(x)) | x ∈ A}.


Function Builder Notation 


It is very convenient to further simplify the notation and to denote the function F 
using the function builder notation: 


F = ⟨a(x) | x ∈ A⟩.


Notice the use of angle-brackets in place of the curly braces. The function builder
notation is highly useful in defining new functions. For example, the relation


R = {(x, y) ∈ R² | x ∈ [0, 1], y = x² + 1}


is a function with domain [0, 1] which is denoted by ⟨x² + 1 | x ∈ [0, 1]⟩.

Two functions F and G are said to agree on a set A if A ⊆ dom(F), A ⊆
dom(G), and F(x) = G(x) for all x ∈ A.

If F: A → B and C ⊆ A, then the restriction of F to C, denoted by F|_C or
F↾C, is the function with domain C defined as


F|_C := {(x, y) | x ∈ C and (x, y) ∈ F}.


If G = F|_C, we also say that F is an extension of G. Note that F|_C could also be
defined as ⟨F(x) | x ∈ C⟩ and is the unique function with domain C which agrees
with F on C.

Let F: A → B. For each C ⊆ A we define the (forward) image of C under F,
denoted by F[C], as the set


F[C] := {F(x) | x ∈ C}.


Thus F[C] = ran(F|_C). Similarly, for each D ⊆ B, we define the inverse image of
D under F, denoted by F⁻¹[D], as the set


F⁻¹[D] := {x ∈ A | F(x) ∈ D}.
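
For a finite function stored as a set of pairs, the forward image and the inverse image follow directly from the two definitions. This is a minimal sketch; `image` and `inverse_image` are hypothetical helper names, and the sample function x ↦ x² on {−3, ..., 3} is our own choice.

```python
def image(F, C):
    """F[C] := {F(x) | x ∈ C}."""
    return {y for (x, y) in F if x in C}


def inverse_image(F, D):
    """F⁻¹[D] := {x ∈ dom(F) | F(x) ∈ D}."""
    return {x for (x, y) in F if y in D}


F = {(x, x * x) for x in range(-3, 4)}   # x ↦ x² on A = {-3, ..., 3}

squares_of_twos = image(F, {-2, 2})
roots = inverse_image(F, {4, 9})
```

Here `squares_of_twos` is {4} and `roots` is {−3, −2, 2, 3}. Note that the inverse image is defined even though F⁻¹ is not a function in this example.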


Problem 22. Let F: X → Y, A, B ⊆ X, and C, D ⊆ Y. Show that


1. F[A ∪ B] = F[A] ∪ F[B] and F[A ∩ B] ⊆ F[A] ∩ F[B].
2. The equality F[A ∩ B] = F[A] ∩ F[B] may fail.
3. F⁻¹[C ∪ D] = F⁻¹[C] ∪ F⁻¹[D] and F⁻¹[C ∩ D] = F⁻¹[C] ∩ F⁻¹[D].
4. F⁻¹[Y ∖ C] = X ∖ F⁻¹[C].


If F: A → B and G: B → C are functions, then their composition G ∘ F is the
function G ∘ F: A → C defined by


G ∘ F := ⟨G(F(x)) | x ∈ A⟩,


which is well defined since ran(F) ⊆ dom(G).
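
Composition of finite functions, again stored as sets of pairs, can be sketched as below. The helper name `compose` and the sample functions are our own; the code relies on ran(F) ⊆ dom(G), exactly the condition noted above, and would raise a `KeyError` otherwise.

```python
def compose(G, F):
    """G ∘ F = <G(F(x)) | x ∈ dom(F)>, assuming ran(F) ⊆ dom(G)."""
    G_map = dict(G)                       # look up G(y) by key y
    return {(x, G_map[y]) for (x, y) in F}


F = {(n, n + 1) for n in range(5)}        # F: {0,...,4} → {1,...,5}
G = {(m, 2 * m) for m in range(1, 6)}     # G: {1,...,5} → {2,4,...,10}

GF = compose(G, F)                        # x ↦ 2(x + 1)
```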
Problem 23. Show that function composition is associative. 


Problem 24. If F: A → B, G: B → C, X ⊆ A, and Y ⊆ C, then we have
(G ∘ F)[X] = G[F[X]] and (G ∘ F)⁻¹[Y] = F⁻¹[G⁻¹[Y]].


A function F: A → B is said to be one-to-one or injective if


F(u) = F(v) ⇒ u = v (for all u, v ∈ A).


Note that a function F is one-to-one if and only if the inverse relation F⁻¹ is a
function (in this case we will have dom(F⁻¹) = ran(F)).


Problem 25. Show that F: A → B is injective if and only if for all X, Y ⊆ A we
have F[X ∩ Y] = F[X] ∩ F[Y].


A function F: A → B is onto or surjective if ran(F) = B, i.e., if
for each y ∈ B there is x ∈ A such that y = F(x).


Note the terminological abuse mentioned earlier. The term “onto” or “surjective” 
really applies to the triplet (F, A, B). 

A function F: A → B which is both one-to-one and onto is called a one-to-one
correspondence or a bijection from A onto B (or a bijection between A and B).
When A = B, that is if F: A → A is a bijection, we say that F is a bijection on A.

For example, if Z is the set of all integers, positive, negative, or zero, A is the set
of all odd integers in Z, and B is the set of all even integers in Z, then the function
F := ⟨n + 1 | n ∈ Z⟩ is a bijection on Z, while F|_A is a bijection from A onto B
and F|_B is a bijection from B onto A.


Problem 26. For any set A, define a bijection between the set of all reflexive 
relations on A and the set of all irreflexive relations on A. 


Problem 27. Show that 

1. For any set A, the identity mapping on A given by ⟨x | x ∈ A⟩ is a bijection on A.

2. If F: A → B is an injection, then F: A → F[A] is a bijection (where F[A] =
ran(F)).

3. If F is a bijection from A onto B, then F⁻¹ is a bijection from B onto A.


Problem 28. Let F: A → B, G: B → C, and let G ∘ F: A → C be their
composition. If F and G are injective, so is G ∘ F, and if F and G are surjective,
so is G ∘ F. Conclude that a composition of bijections is a bijection.


Problem 29. Let F: X → Y. Show that


1. F is injective if and only if there is a G: Y → X such that the composition G ∘ F
equals the identity function on X.

2. If there is a G: Y → X such that the composition F ∘ G equals the identity
function on Y, then F: X → Y is surjective.


The converse of the second result in the last problem holds under the axiom of choice 
which will be introduced and studied in a later chapter. 


Problem 30. Prove that for any infinite subset A of the set N of positive integers,
there is a bijection between N and A.


If A, B are sets, then B^A denotes the collection of all functions from A to B:


B^A := {f | f: A → B}.


Thus f ∈ B^A ⇔ f: A → B.


1.6 Families and Partitions 


Indexed Families 


A function $E$ with domain $I = \operatorname{dom}(E)$ will also be called an indexed family with
index set $I$. In this case it is customary to denote $E(i)$ by $E_i$ for each $i \in I$, and
to denote the entire indexed family, that is, the function $E$, as:


$$E = (E_i \mid i \in I).$$


If $E_i$ is a set for each $i \in I$, then we say that $(E_i \mid i \in I)$ is an indexed family of sets
(with index set $I$). Thus $\mathrm{P}(X)^I$ is the collection of all indexed families of subsets
of $X$ with index set $I$.

Given an indexed family of sets $(E_i \mid i \in I)$, we define its union


$$\bigcup_{i \in I} E_i := \{ x \mid x \in E_i \text{ for some } i \in I \},$$


and intersection


$$\bigcap_{i \in I} E_i := \{ x \mid x \in E_i \text{ for all } i \in I \}.$$
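
For finite families, the union and intersection just defined can be computed directly. The following is only an illustrative sketch: it assumes a finite index set and finite sets, models an indexed family as a Python dictionary from indices to sets, and uses the hypothetical names `family_union`, `family_intersection`, and `E`, none of which come from the text.

```python
from functools import reduce

def family_union(family):
    """Union of an indexed family given as a dict {index: set}."""
    return set().union(*family.values())

def family_intersection(family):
    """Intersection of an indexed family; this sketch requires a nonempty index set."""
    if not family:
        raise ValueError("index set must be nonempty")
    return reduce(lambda acc, s: acc & s, family.values())

# Hypothetical family (E_i | i in I) with index set I = {1, 2, 3}.
E = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {3, 4, 5}}

print(sorted(family_union(E)))         # [1, 2, 3, 4, 5]
print(sorted(family_intersection(E)))  # [3]
```

Here an element belongs to the union if it lies in some $E_i$, and to the intersection if it lies in every $E_i$, exactly as in the two defining conditions.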

An indexed family of sets $(E_i \mid i \in I)$ is said to be pairwise disjoint if
$$i \neq j \Rightarrow E_i \cap E_j = \varnothing \quad (\text{for all } i, j \in I).$$


When the index set is $\mathbf{N}$, the indexed family $(E_n \mid n \in \mathbf{N})$ is called a sequence of
sets, and we use the following notations:


$$\bigcup_{n=1}^{\infty} E_n := \bigcup_{n \in \mathbf{N}} E_n \quad \text{and} \quad \bigcap_{n=1}^{\infty} E_n := \bigcap_{n \in \mathbf{N}} E_n.$$


Of course, this notation can naturally be extended to the case where the starting
index is an integer other than 1.


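Problem 31 (De Morgan’s Laws). Let $U$ be a fixed “universal” set, and for $E \subseteq U$,
let $E' := U \setminus E$ denote the complement of $E$. If $(E_i \mid i \in I)$ is any indexed family
of subsets of $U$, show that


$$\Bigl( \bigcup_{i \in I} E_i \Bigr)' = \bigcap_{i \in I} E_i' \quad \text{and} \quad \Bigl( \bigcap_{i \in I} E_i \Bigr)' = \bigcup_{i \in I} E_i'.$$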


Problem 32. If $f: X \to Y$ and $(E_i \mid i \in I)$ is an indexed family of subsets of $X$,
then


$$f\Bigl[ \bigcup_{i \in I} E_i \Bigr] = \bigcup_{i \in I} f[E_i].$$


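Problem 33. If $f: X \to Y$ and $(F_j \mid j \in J)$ is an indexed family of subsets of $Y$,
then


$$f^{-1}\Bigl[ \bigcup_{j \in J} F_j \Bigr] = \bigcup_{j \in J} f^{-1}[F_j] \quad \text{and} \quad f^{-1}\Bigl[ \bigcap_{j \in J} F_j \Bigr] = \bigcap_{j \in J} f^{-1}[F_j].$$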


Unindexed Families (or Collections) of Sets 


If $C$ is a set whose every member is itself a set, we say that $C$ is a family of sets
(unindexed) or a collection of sets. For example, $\mathrm{P}(A)$ is a family of sets. If each
member of $C$ is a subset of a fixed set $X$, or equivalently, if $C \subseteq \mathrm{P}(X)$, we say that
$C$ is a collection of subsets of $X$ or a family of subsets of $X$. Thus $\mathrm{P}(X)$ is the largest
family of subsets of $X$.

If $C$ is any (unindexed) family of sets, we can convert $C$ into the indexed family
$(A \mid A \in C)$, which is the identity function on $C$. Hence the notation


$$\bigcup_{A \in C} A$$


can be used to denote the union of members of $C$, but in this case we can use the
simpler notation


$$\bigcup C := \bigcup_{A \in C} A.$$


Thus $\bigcup C$ is the set of members of members of $C$, that is:
$$x \in \bigcup C \Leftrightarrow x \in A \text{ for some } A \in C.$$


Similarly, we define:


$$\bigcap C := \bigcap_{A \in C} A, \quad \text{and so} \quad x \in \bigcap C \Leftrightarrow x \in A \text{ for every } A \in C.$$


Problem 34. Show that $\bigcup C$ is the “smallest set containing every set in $C$” in the
sense that it contains every set in $C$ and is contained in any set which contains every
set in $C$.

Similarly show that $\bigcap C$ is the largest set contained in every set in $C$, that is, it is
contained in every set in $C$ and contains any set which is contained in every set in $C$.


Problem 35*. What is $\bigcap C$ if $C$ is empty?


Partitions 


A family $C$ of sets is called (pairwise) disjoint if any pair of sets in $C$ are either
identical or disjoint, i.e., if for all $A, B \in C$, $A = B$ or $A \cap B = \varnothing$.

We say that a family $C$ covers or exhausts a set $X$ if every element of $X$ belongs
to some set in $C$ (i.e., if for all $x \in X$ there is $A \in C$ such that $x \in A$), or
equivalently if $X \subseteq \bigcup C$.

We say that $C$ is a partition of $X$ if $C$ is a disjoint family of nonempty subsets of
$X$ which covers $X$. More precisely, $C$ is a partition of $X$ if


• $C$ is a family of subsets of $X$ ($C \subseteq \mathrm{P}(X)$);

• No member of $C$ is empty ($B \in C \Rightarrow B \neq \varnothing$);

• Distinct sets in $C$ are disjoint ($A, B \in C$ and $A \neq B \Rightarrow A \cap B = \varnothing$);

• $C$ covers $X$ (for all $x \in X$, $x \in A$ for some $A \in C$, or $X = \bigcup C$).
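
The four conditions can be checked mechanically when everything is finite. The sketch below is an illustration only, assuming finite sets represented as Python sets; the function name `is_partition` and the sample collections are hypothetical.

```python
def is_partition(C, X):
    """Return True if the collection C of sets is a partition of the set X."""
    X = set(X)
    C = [set(B) for B in C]
    subsets = all(B <= X for B in C)            # C is a family of subsets of X
    nonempty = all(len(B) > 0 for B in C)       # no member of C is empty
    disjoint = all(A == B or not (A & B)        # distinct members are disjoint
                   for A in C for B in C)
    covers = X <= set().union(*C)               # every x in X lies in some member
    return subsets and nonempty and disjoint and covers

X = {1, 2, 3, 4}
print(is_partition([{1, 2}, {3}, {4}], X))     # True
print(is_partition([{1, 2}, {2, 3}, {4}], X))  # False: {1, 2} and {2, 3} overlap
print(is_partition([{1, 2}, {3}], X))          # False: 4 is not covered
```

Each Boolean in the function corresponds to one bullet of the definition, so a failed check shows which condition is violated.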

Problem 36. List all possible partitions of $\{a, b, c\}$ (with $a, b, c$ distinct).

Problem 37. Let $E_1 := \{2k - 1 \mid k \in \mathbf{N}\}$ be the set of odd positive integers, and
inductively define, for each $n = 1, 2, \ldots$,


$$E_{n+1} := \{2k \mid k \in E_n\}.$$


Show that $\{E_n \mid n \in \mathbf{N}\}$ is a partition of $\mathbf{N}$.


Problem 38. For each of the following, determine if $C$ is a partition or not.


(a) $C = \varnothing$; (b) $C = \{\varnothing\}$; (c) $C = \{\{\varnothing\}\}$.


1.7 Finite and Infinite Sequences and Strings 


The notion of ordered pair can be generalized to that of a finite sequence, which can
have any finite number of entries instead of just two. For example, $(a, b, c)$ denotes
the ordered triple consisting of the entries $a, b, c$ in the order of appearance. In
general, we will use the notation $(a_1, a_2, \ldots, a_n)$ to denote the ordered $n$-tuple or the
finite sequence of length $n$ consisting of the entries $a_1, a_2, \ldots, a_n$ in the displayed
order. Its defining property is:


$$(a_1, a_2, \ldots, a_n) = (b_1, b_2, \ldots, b_n) \Rightarrow a_1 = b_1,\ a_2 = b_2,\ \ldots,\ a_n = b_n.$$

A finite sequence of length $n$ can be officially defined as a function whose domain
is the set $\{1, 2, \ldots, n\} = \{k \in \mathbf{N} \mid 1 \le k \le n\}$ of the first $n$ natural numbers. So an
$n$-tuple $a$ is a function $a: \{1, 2, \ldots, n\} \to \operatorname{ran}(a)$, with $k$-th entry being $a(k)$. It is
then customary to abbreviate $a(k)$ as $a_k$, and we have


$$a = (a(1), a(2), \ldots, a(n)) = (a_1, a_2, \ldots, a_n) = (a_k \mid 1 \le k \le n),$$


where the last expression on the right uses the function-builder notation.⁵
Cartesian products are also generalized by defining


$$A_1 \times A_2 \times \cdots \times A_n := \{ (a_1, a_2, \ldots, a_n) \mid a_1 \in A_1,\ a_2 \in A_2,\ \ldots,\ a_n \in A_n \}.$$


⁵The case $n = 2$ causes a notational conflict between the ordered pair and the finite sequence
of length 2, as both are denoted by $(a, b)$. To be pedantic, we could use a separate notation for
$n$-tuples (such as $[a_1, a_2, \ldots, a_n]$), but our ambiguous notation will hardly cause any real trouble
in informal set theory. A similar remark applies to the case $n = 1$, where we identify $A^1$ with $A$ by
confusing $(a)$ with $a$.

If all the “factors” of a Cartesian product are equal, we use the abbreviations:


$$A^2 := A \times A, \quad A^3 := A \times A \times A, \quad \ldots, \quad A^n := \underbrace{A \times A \times \cdots \times A}_{n \text{ factors}}.$$


Thus $A^n$ is the set $A^{\{1, 2, \ldots, n\}}$ of all finite sequences of length $n$ with entries from
$A$, which is consistent with the familiar notations $\mathbf{R}^2$ for the Cartesian plane and
$\mathbf{R}^3$ for the usual 3-space. We also identify $A^1$ with $A$. Finally, when $n = 0$ the set
$\{1, 2, \ldots, n\}$ is empty and the only function with empty domain is the empty set $\varnothing$
itself, so we have $A^0 = A^{\varnothing} = \{\varnothing\}$.

The collection of all finite sequences from $A$ of all possible lengths will be
denoted by $A^*$:


$$A^* := \bigcup_{n=0}^{\infty} A^n.$$


In particular, $A^*$ includes the empty sequence $\varnothing$ which has length 0.
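
For a finite set $A$, the sets $A^n$ can be listed explicitly, and any finite portion of $A^*$ can be obtained by collecting them for small $n$. The following sketch assumes Python tuples as a stand-in for finite sequences; the helper names `sequences_of_length` and `sequences_up_to` are hypothetical.

```python
from itertools import product

def sequences_of_length(A, n):
    """All finite sequences (as tuples) of length n with entries from A."""
    return list(product(sorted(A), repeat=n))

def sequences_up_to(A, max_len):
    """The part of A* consisting of sequences of length 0, 1, ..., max_len."""
    return [s for n in range(max_len + 1) for s in sequences_of_length(A, n)]

A = {0, 1}
print(sequences_of_length(A, 0))       # [()]  (only the empty sequence)
print(len(sequences_of_length(A, 3)))  # 8
print(sequences_up_to(A, 2))
# [(), (0,), (1,), (0, 0), (0, 1), (1, 0), (1, 1)]
```

Note that the case $n = 0$ yields exactly one sequence, the empty one, in agreement with the convention for $A^0$.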
We will also consider non-terminating infinite sequences of the form


$$(a_1, a_2, \ldots, a_k, \ldots) = (a_k \mid k \in \mathbf{N}).$$
As suggested by the function-builder notation on the right above, an infinite
sequence from a set $A$ is officially defined to be a function $a: \mathbf{N} \to A$. Thus $A^{\mathbf{N}}$
is the set of all infinite sequences from $A$. Once again, it is customary to abbreviate
$a(k)$ as $a_k$, so that $a = (a_k \mid k \in \mathbf{N})$ for $a \in A^{\mathbf{N}}$. Also, we will sometimes abbreviate
the set $\{0, 1\}^{\mathbf{N}}$ of all infinite binary sequences as $2^{\mathbf{N}}$.
To summarize, a member $a \in A^*$ (a finite sequence) can be written as


$$a = (a_1, a_2, \ldots, a_n) = (a_k \mid 1 \le k \le n),$$


for some $n \ge 0$ (we get the empty sequence by taking $n = 0$), while a member
$a \in A^{\mathbf{N}}$ (an infinite sequence) can be written as


$$a = (a_1, a_2, \ldots, a_k, \ldots) = (a_k \mid k \in \mathbf{N}).$$


Alphabets and Strings 


Sometimes it is more convenient to regard the set $A$ as an alphabet whose elements
are symbols or letters, and write the sequence $a = (a_1, a_2, \ldots, a_n)$ more simply as
a word or string of symbols, as in:


$$a = a_1 a_2 \cdots a_n.$$

In such contexts, the empty sequence is called the empty string or the empty word,
and is denoted by $\varepsilon$ (instead of $\varnothing$); it is the unique string of length zero. The length
of a string $u$ is denoted by $\operatorname{len}(u)$.

For example, when $A = \{0, 1\}$, we say that $A$ is the binary alphabet consisting
of the two binary digits (or bits) 0 and 1. A string from $A$ will now be a word
composed of the symbols 0 and 1, such as “10001110” or “00101,” and we have the
set of binary strings:


$$\{0, 1\}^* := \{\varepsilon, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, \ldots\},$$


where for each $n$ there are $2^n$ binary words of length $n$.

If $a = a_1 a_2 \cdots a_m$ and $b = b_1 b_2 \cdots b_n$ are finite strings of length $m$ and $n$
respectively, we say that $a$ is an initial segment or prefix of $b$ if $m \le n$ and $a_k = b_k$
for all $k \le m$. In this case, we also say that $b$ is an extension of $a$ (or $b$ extends $a$).
When $b$ extends $a$ and $\operatorname{len}(b) = \operatorname{len}(a) + 1$, we say that $b$ is an immediate extension
of $a$.

If $u = u_1 u_2 \cdots u_m$ is a string of length $m$ and $v = v_1 v_2 \cdots v_n$ is a string of length
$n$, we can form their concatenation, denoted by $u * v$, to be the string of length $m + n$
obtained by “writing $u$ followed by $v$,” as in:


$$u * v = u_1 u_2 \cdots u_m v_1 v_2 \cdots v_n.$$


Thus $u$ is a prefix of $w$ if and only if $w = u * v$ for some string $v$. Note that
$\operatorname{len}(u * v) = \operatorname{len}(u) + \operatorname{len}(v)$.
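
These notions map directly onto ordinary string operations. The sketch below is only an illustration, assuming Python strings over the binary alphabet as a model of finite strings (so that `+` plays the role of concatenation); the function names are hypothetical.

```python
def is_prefix(a, b):
    """True if string a is an initial segment (prefix) of string b."""
    return len(a) <= len(b) and all(a[k] == b[k] for k in range(len(a)))

def is_immediate_extension(b, a):
    """True if b extends a by exactly one letter."""
    return is_prefix(a, b) and len(b) == len(a) + 1

u, v = "100", "0111"
w = u + v                                 # the concatenation of u and v
print(w)                                  # 1000111
print(is_prefix(u, w))                    # True
print(len(w) == len(u) + len(v))          # True
print(is_immediate_extension("1001", u))  # True
print(is_prefix("11", w))                 # False
```

The empty string `""` is a prefix of every string, matching the convention that the empty word has length zero.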

Let $A$ be an alphabet. If $u \in A^*$ is a finite string from $A$ and $s \in A$ is a letter
in $A$, we use the notation $u^\frown s$ to denote the immediate extension of $u$ obtained by
suffixing it with the letter $s$, that is, if $u = u_1 u_2 \cdots u_m$ with $u_1, u_2, \ldots, u_m \in A$, then


$$u^\frown s := u * (s) = u_1 u_2 \cdots u_m s.$$


Thus $\operatorname{len}(u^\frown s) = \operatorname{len}(u) + 1$.

It is often useful to regard an infinite sequence $a \in A^{\mathbf{N}}$ as an infinite string
$a = a_1 a_2 \cdots a_n \cdots$ of letters from the alphabet $A$. If $u = u_1 u_2 \cdots u_m$ is a finite string
and $v = v_1 v_2 \cdots v_n \cdots$ is an infinite string, then we say that $u$ is a (finite) initial
prefix of $v$, or that $v$ extends $u$, if $u_k = v_k$ for $k = 1, 2, \ldots, m$.

If $a = a_1 a_2 \cdots a_n \cdots$ is an infinite string, we use the notation


$$a|n := a_1 a_2 \cdots a_n,$$
to denote the finite initial prefix of $a$ obtained by truncating $a$ to its first $n$ letters.


Thus $a|0 = \varepsilon$, $a|1 = a_1$, $a|2 = a_1 a_2$, $a|3 = a_1 a_2 a_3$, etc., and $a$ extends $a|n$ for
every $n$.

Finally, given a finite string $a = a_1 a_2 \cdots a_m$ and an infinite string
$b = b_1 b_2 \cdots b_n \cdots$, we can concatenate them to form the infinite string $a * b$ as


$$a * b := a_1 a_2 \cdots a_m b_1 b_2 \cdots b_n \cdots,$$
or more formally as the infinite string $c = c_1 c_2 \cdots c_n \cdots$, where


$$c_k := \begin{cases} a_k & \text{if } k \le m, \\ b_{k-m} & \text{if } k > m. \end{cases}$$
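
The case definition above translates directly into code if an infinite string is modelled, as in its official definition, by a function on the positive integers. The sketch below makes that assumption; the names `concat` and `truncate`, and the sample infinite string `b`, are hypothetical.

```python
from typing import Callable

def concat(a: str, b: Callable[[int], str]) -> Callable[[int], str]:
    """Concatenate a finite string a with an infinite string b (a function on 1, 2, 3, ...)."""
    m = len(a)
    def c(k: int) -> str:
        # c_k is a_k if k <= m, and b_{k-m} if k > m (positions start at 1).
        return a[k - 1] if k <= m else b(k - m)
    return c

def truncate(x: Callable[[int], str], n: int) -> str:
    """The finite prefix consisting of the first n letters of the infinite string x."""
    return "".join(x(k) for k in range(1, n + 1))

# Hypothetical infinite binary string 101010...: letter 1 at odd positions, 0 at even ones.
b = lambda k: "01"[k % 2]

c = concat("000", b)
print(truncate(b, 4))  # 1010
print(truncate(c, 7))  # 0001010
print(truncate(c, 0) == "")  # True: truncating to 0 letters gives the empty string
```

Only finitely many letters of an infinite string are ever computed, which is why the truncation to the first $n$ letters is the natural way to inspect the result.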


1.8 Partitions and Equivalence Relations 


A relation $R$ on a set $A$ is said to be an equivalence relation on $A$ if $R$ is reflexive,
symmetric, and transitive on $A$. The symbols $\sim$ and $\equiv$ are often used to denote
equivalence relations, and we say $x$ is equivalent to $y$ to express $x \sim y$. Thus $\sim$ is
an equivalence relation on $A$ if and only if:


1. Reflexivity: $x \sim x$ (for all $x \in A$);
2. Symmetry: $x \sim y \Rightarrow y \sim x$ (for all $x, y \in A$); and
3. Transitivity: $x \sim y$ and $y \sim z \Rightarrow x \sim z$ (for all $x, y, z \in A$).


The identity relation = is an equivalence relation. In fact, the notion of equivalence 
relation can be viewed as a generalization of the notion of identity. 


Problem 39. Let $F$ be a function with $\operatorname{dom}(F) = A$, and for $x, y \in A$, define
$$x \sim y \Leftrightarrow F(x) = F(y).$$


Show that $\sim$ is an equivalence relation on $A$.


Given any equivalence relation $\sim$ on a set $A$, a function $F$ with domain $A$ is called
a complete invariant for the relation $\sim$ if “$F$ reduces $\sim$ to the identity $=$” in the
following sense:


$$x \sim y \Leftrightarrow F(x) = F(y) \quad (\text{for all } x, y \in A).$$


If $\sim$ is an equivalence relation on $A$ and $x \in A$, the $\sim$-equivalence class of $x$,
denoted by $[x]_\sim$, or simply by $[x]$ if there is no risk of confusion, is defined as:


$$[x] = [x]_\sim := \{ y \in A \mid x \sim y \}.$$


Thus $[x]$ is the set of elements equivalent to $x$, and so $y \in [x] \Leftrightarrow y \sim x$.
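
When a complete invariant is available and the set is finite, the equivalence classes can be found by grouping elements according to the value of the invariant. The following sketch illustrates this under those assumptions; the function name `classes_by_invariant` and the sample invariant (squaring on a finite range of integers) are hypothetical choices, not taken from the text.

```python
from collections import defaultdict

def classes_by_invariant(A, F):
    """Equivalence classes of the relation 'F(x) == F(y)' on a finite collection A."""
    groups = defaultdict(set)
    for x in A:
        groups[F(x)].add(x)          # elements with the same invariant land together
    return {frozenset(block) for block in groups.values()}

A = range(-4, 5)
F = lambda x: x * x                  # hypothetical invariant: x ~ y iff x*x == y*y
classes = classes_by_invariant(A, F)

print(len(classes))                     # 5
print(frozenset({-2, 2}) in classes)    # True
print(frozenset({0}) in classes)        # True
```

Two elements end up in the same block exactly when the invariant takes the same value on both, which is the defining condition of a complete invariant.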

Problem 40. Let $\sim$ be an equivalence relation on $A$. Prove that $x \in [x]$ for all
$x \in A$. Prove also that for any $x, y \in A$, either $[x] = [y]$ or $[x] \cap [y] = \varnothing$.


Problem 41. Let $\sim$ be an equivalence relation on $A$. Prove that if $x, y \in A$, then
$[x] = [y]$ if and only if $x \sim y$.

Therefore every equivalence relation has a natural complete invariant:


Theorem 42 (Principle of Abstraction). Given any equivalence relation $\sim$ on a
set $A$, the mapping $x \mapsto [x]_\sim$, which assigns to every element its own equivalence
class, is a natural complete invariant for $\sim$, i.e.,


$$x \sim y \Leftrightarrow [x]_\sim = [y]_\sim \quad (\text{for all } x, y).$$


The mapping $x \mapsto [x]_\sim$ is called the quotient map given by $\sim$.


The following theorem exhibits a natural one-to-one correspondence between
equivalence relations and partitions over a given set $A$, thus bringing out the fact
that the two notions “equivalence relation” and “partition” in a sense represent the
same concept (i.e., each can be regarded as a form of the other).


Theorem 43 (Identifying Equivalence Relations with Partitions). Given an
equivalence relation $\sim$ on $A$, the family $\Pi(\sim)$ of all distinct $\sim$-equivalence classes
forms a partition of $A$ such that


$$x \sim y \Leftrightarrow x \text{ and } y \text{ belong to some common set in the partition } \Pi(\sim).$$
Conversely, given any partition $C$ of $A$, the relation $E(C)$ on $A$ defined by
$$x \mathrel{E(C)} y \Leftrightarrow \text{there is some } B \in C \text{ such that } x, y \in B$$


is an equivalence relation on $A$ such that $C$ equals the family of all $E(C)$-equivalence
classes.
Moreover, we have


$$E(\Pi(\sim)) = {\sim}, \quad \text{for any equivalence relation } \sim, \text{ and}$$


$$\Pi(E(C)) = C, \quad \text{for any partition } C.$$
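
The two identities can be spot-checked on a small finite example. The sketch below is not a proof and assumes a finite set, a relation given as a Python predicate, and partitions given as sets of frozensets; the function names `Pi` and `E` mirror the notation of the theorem, while the sample relation (congruence modulo 3 on $\{1, \ldots, 8\}$) is a hypothetical choice.

```python
def Pi(A, related):
    """The family of all equivalence classes of `related` on the finite set A."""
    return {frozenset(y for y in A if related(x, y)) for x in A}

def E(C):
    """The relation (as a set of ordered pairs) determined by the partition C."""
    return {(x, y) for B in C for x in B for y in B}

A = set(range(1, 9))
related = lambda x, y: (x - y) % 3 == 0     # hypothetical equivalence relation on A

C = Pi(A, related)
pairs = {(x, y) for x in A for y in A if related(x, y)}

print(sorted(sorted(B) for B in C))              # [[1, 4, 7], [2, 5, 8], [3, 6]]
print(E(C) == pairs)                             # True: E(Pi(~)) equals ~
print(Pi(A, lambda x, y: (x, y) in E(C)) == C)   # True: Pi(E(C)) equals C
```

Passing from the relation to its classes and back recovers the same set of pairs, and passing from the partition to its relation and back recovers the same blocks.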


Problem 44. Prove Theorem 43. 


Given an equivalence relation $\sim$ on $A$, the partition $\Pi(\sim)$ of $A$ consisting of all
the $\sim$-equivalence classes is often denoted by $A/{\sim}$, and is called the quotient of $A$
modulo $\sim$. The quotient map $x \mapsto [x]$ of Theorem 42 (the natural complete invariant
for $\sim$) is then a surjection of $A$ onto $A/{\sim}$.


Problem 45. Let $\mathbf{Z}$ be the set of all integers, positive, negative, and zero. We write
$x \mid y$ to express “$x$ divides $y$,” i.e., $y = xz$ for some $z \in \mathbf{Z}$. Define two relations $\equiv$
and $\sim$ on $\mathbf{Z}$ by the conditions

$$x \equiv y \Leftrightarrow 4 \mid (x - y) \quad \text{and} \quad x \sim y \Leftrightarrow x \mid y \text{ and } y \mid x.$$
Show that both $\equiv$ and $\sim$ are equivalence relations on $\mathbf{Z}$. In each case, what are the
equivalence classes and what is the partition?

Problem 46. Let $\mathbf{R}^2 := \{(x, y) \mid x, y \in \mathbf{R}\}$ be the usual plane, and define a
relation $\sim$ on the plane by


$$(x_1, y_1) \sim (x_2, y_2) \Leftrightarrow x_1 + y_2 = x_2 + y_1.$$


1. Show that $\sim$ is an equivalence relation on $\mathbf{R}^2$.
2. Describe the equivalence classes and the partition.
3. Find a complete invariant for $\sim$.


Problem 47. Let $\mathbf{N} := \{1, 2, 3, \ldots\}$ be the set of natural numbers. Define an
equivalence relation $\equiv$ on $\mathbf{N}$ by:


$$m \equiv n \Leftrightarrow \text{for all } k \in \mathbf{N}: \; 2^k \mid m \Leftrightarrow 2^k \mid n.$$


Describe the equivalence classes and the partition given by $\equiv$. Can you find a
complete invariant for this equivalence relation?


Problem 48. Let $\mathbf{N} := \{1, 2, 3, \ldots\}$ be the set of natural numbers. Define an
equivalence relation $\sim$ on $\mathbf{N}$ by:


$$m \sim n \Leftrightarrow \text{for every prime } p: \; p \mid m \Leftrightarrow p \mid n.$$


Describe the equivalence classes and the partition given by $\sim$. Can you find a
complete invariant for this equivalence relation?


Problem 49. Define an equivalence relation $\sim$ on the set $\mathbf{R}$ of reals by
$$x \sim y \Leftrightarrow \cos x = \cos y.$$


Precisely describe the equivalence classes and the corresponding partition.


1.9 Orders (Linear Orders) 


We will study orders in detail starting from Chap. 7, but a few basic notions needed 
before Chap. 7 will be introduced here. 

We say that a relation $R$ is a (linear) order on a set $X$, or that $(X, R)$ is a (linear)
order, or simply that $R$ orders $X$, if $R$ is a transitive relation on $X$ satisfying the
trichotomy property: For all $a, b \in X$, exactly one of


$$aRb, \quad a = b, \quad bRa$$


holds. (Thus if $R$ orders $X$ and $a, b \in X$ are distinct ($a \neq b$), then either $aRb$ or
$bRa$, but not both.)
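
On a finite set, transitivity and trichotomy can be verified by brute force. The following sketch assumes that a relation is represented as a Python set of ordered pairs; the function name `is_linear_order` and the sample relations are hypothetical.

```python
def is_linear_order(X, R):
    """Check whether the set of pairs R is a (linear) order on the finite set X."""
    X = set(X)
    on_X = all(a in X and b in X for (a, b) in R)
    transitive = all((a, d) in R
                     for (a, b) in R for (c, d) in R if b == c)
    # Exactly one of aRb, a = b, bRa must hold for every a, b in X.
    trichotomy = all(sum([(a, b) in R, a == b, (b, a) in R]) == 1
                     for a in X for b in X)
    return on_X and transitive and trichotomy

X = {1, 2, 3, 4}
less = {(a, b) for a in X for b in X if a < b}
less_or_equal = {(a, b) for a in X for b in X if a <= b}
Y = {2, 3, 4, 6}
properly_divides = {(a, b) for a in Y for b in Y if a != b and b % a == 0}

print(is_linear_order(X, less))              # True
print(is_linear_order(X, less_or_equal))     # False: aRa and a = a both hold
print(is_linear_order(Y, properly_divides))  # False: 2 and 3 are incomparable
```

The second example shows that trichotomy forces the relation to be strict, and the third shows that trichotomy also forces any two distinct elements to be comparable.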


Problem 50. If R orders X, then R is irreflexive and asymmetric on X. Moreover, 
we have: R orders X if and only if R is a relation on X which is transitive, 
asymmetric, and connected on X. 


Notation and Terminology. If $R$ orders $X$, we write $x <_R y$ to denote $xRy$. When
there is no chance of confusion, we even drop the subscript $R$ and simply write
$x < y$ for $xRy$, and so $x \le y$ means $xRy$ or $x = y$. We will also say that “$X$ is an
order” in place of “$(X, <)$ is an order.”


In a general order $X$, intervals are defined using the familiar notations for the
real number line. Subsets such as $(a, b) := \{x \in X \mid a < x < b\}$ and
$(a, \infty) := \{x \in X \mid a < x\}$ are open intervals, while examples of closed intervals are
$[a, b] := \{x \in X \mid a \le x \le b\}$ and $(-\infty, a] := \{x \in X \mid x \le a\}$. Similarly, we could also
define half-open intervals such as $[a, b)$. In a general order, the interval $(-\infty, a)$
will usually be denoted by Pred (a) or Pred(a). 


Let $<$ be an order on a set $X$.

If $A \subseteq X$, then an element $a \in X$ is a first or least element of $A$ if $a \in A$ and
for all $x \in A$, $x \neq a \Rightarrow a < x$. Similarly we define last and greatest elements.
An element $a \in X$ is called an endpoint of the order $X$ if $a$ is either a first or a last
element of the entire set $X$.

If $x, y \in X$, we say that $x$ is an immediate predecessor of $y$, or equivalently
that $y$ is an immediate successor of $x$, if $x < y$ and there is no $z \in X$ such that
$x < z < y$. It is easily seen that each element has at most one immediate successor
and at most one immediate predecessor. We also say that two elements are consecutive elements
if one of them is an immediate successor of the other.

The order $<$ on $X$ is said to be a dense order if for all $x, y \in X$, if $x < y$ then
there is some $z \in X$ with $x < z$ and $z < y$. Thus an order is a dense order if and
only if it does not have any pair of consecutive elements.

The order $<$ on $X$ is said to be a well-order if every nonempty subset $A \subseteq X$ has
a least element $a \in A$.


For example, let $X$ be the set $\mathbf{N}$ of natural numbers with their usual order of
magnitude. Then the element 1 is the first element of $X$, but $X$ does not have a last
element. If $m, n \in X$, then $m$ is an immediate predecessor of $n$ (or equivalently $n$
is an immediate successor of $m$) if and only if $n = m + 1$, and two elements are
consecutive if their difference equals 1.

Many interesting examples of orders are obtained by fixing $X$ to be a subset of
$\mathbf{R}$ and taking $<$ to be the usual order of magnitude among the elements of $X$. For
example, for any $a < b$ in $\mathbf{R}$, the real interval $[a, b]$ with the usual order is a dense
order with first element $a$ and last element $b$.


Problem 51. For each of the following, give an example of an order satisfying the 
stated condition. 


1. A dense order (i.e., without consecutive elements) which has a last element but
no first element.

2. An order on an infinite set which has a first and a last element and such that each 
element except the last has an immediate successor and each element except the 
first has an immediate predecessor. 

3. An order having a unique element which has neither an immediate successor 
nor an immediate predecessor while every other element has both an immediate 
successor and an immediate predecessor. 


Part I 
Dedekind: Numbers 


Introduction to Part I 


The primary goal of Part I is to construct the real numbers starting with the natural 
numbers as the only foundation. 

Chapter 2 derives standard properties of natural numbers from the Dedekind–Peano
axioms and then develops the ratios (positive rationals). The chapter ends
with an optional section on Dedekind’s general method of recursive definition
(primitive recursion).

Chapter 3 covers the definition of continuity in the context of linear orders, 
leading to the notion of a linear continuum and the satisfaction of the intermediate 
value theorem. It gives a construction of the real numbers using the method of 
Dedekind cuts. 

The philosophical postscript to this part (Chap. 4) discusses two different 
approaches, namely Frege—Russell absolutism and Dedekindian structuralism, 
which are applicable not only in the conception of the natural numbers, but also 
more generally in the wider context of mathematics. 

A great deal of the material of this part is due to Dedekind. This includes
the Dedekind–Peano axioms, definition by primitive recursion, categoricity of
Dedekind–Peano systems, definition of a linear continuum, the construction of
irrational numbers via cuts in rationals, and the structuralist approach to the natural
numbers. Most of the material of Chap. 2 corresponds to Dedekind’s 1888 work [11]
(Was sind und was sollen die Zahlen?), and that of Chap. 3 to his earlier 1872
work [10] (Stetigkeit und irrationale Zahlen).


Note. In the informal preliminary Chap. 1, we temporarily assumed the existence
and properties of integers and real numbers to provide examples for sets, relations,
and functions. In this part we will drop all such assumptions and derive everything
from the Dedekind–Peano axioms. Familiar notions, like addition, are not assumed
to be known until formally introduced.


Chapter 2
The Dedekind–Peano Axioms


Abstract This chapter develops the theory of natural numbers based on the
Dedekind–Peano Axioms, also known as Peano Arithmetic. The basic theory of
ratios (positive rational numbers) is also developed. The chapter concludes with a section on
formal definition by primitive recursion.


2.1 Introduction 


With the real numbers and their properties as a starting point, a large part of classical 
mathematics known as analysis can be developed deductively. This includes analytic 
geometry, calculus, the theory of sequences and series of real and complex numbers 
and functions, differential equations, and so on. 

Mathematicians in the nineteenth century such as Weierstrass, Dedekind, and 
Cantor produced further analysis and construction of the real numbers which 
reduced everything down to the notion of natural numbers N. 

It thus became clear that (with the aid of a certain amount of set theoretic and
logical apparatus) the entire body of traditional pure mathematics can be constructed
rigorously starting from the theory of natural numbers.¹

Dedekind, in his profound work [11], and Peano, in his clear and highly 
modern axiomatic development [59], showed how, in turn, the entire theory of 
natural numbers could be derived from a few basic axioms and primitive notions. 
The resulting deductive theory is known today as Peano Arithmetic. This chapter 
develops parts of Peano Arithmetic dealing with properties of natural numbers, 
fractions, and ratios. 


Throughout Part I, we assume that only the primitive Dedekind–Peano notions
and axioms are given and that nothing else about any kinds of numbers or their
properties is known. All familiar notions, like addition, will be formally introduced,
and their properties will be derived from the axioms.


¹“God created the natural numbers, all else is work of man,” said Kronecker.


2.2 The Dedekind—Peano Axioms 


The three primitive Dedekind–Peano notions are: “natural number,” “1,” and
“successor,” where the successor of a natural number $n$ is denoted by $S(n)$.
The five axioms involving these primitive notions are the following.


The Dedekind–Peano Axioms. The natural numbers satisfy the axioms:


1. 1 is a natural number.
2. Every natural number $n$ has a unique successor $S(n)$ which also is a natural
number.
3. 1 is not the successor of any natural number.
4. No two distinct natural numbers have the same successor (i.e., for all natural
numbers $m, n$, $S(m) = S(n)$ implies $m = n$).
5. Induction: If $P$ is a property of natural numbers such that

a. 1 has property $P$, and
b. whenever a natural number has property $P$ so does its successor,

then all natural numbers have property $P$.


Mathematicians found it remarkable that all known properties of natural numbers 
can be derived from the Dedekind—Peano Axioms. 
We define: 


$$2 := S(1), \quad 3 := S(2), \quad 4 := S(3), \quad 5 := S(4), \quad 6 := S(5),$$
$$7 := S(6), \quad 8 := S(7), \quad 9 := S(8), \quad 10 := S(9), \quad \text{etc.},$$


adopting the usual decimal notation as a shorthand to replace long formal expressions
of the form “$S(\cdots S(S(1)) \cdots)$.”
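
To see what the decimal shorthand abbreviates, one can model the numerals concretely. The sketch below is one possible model, chosen here purely for illustration and not part of the axiomatic development: the numeral 1 is represented by the empty tuple and the successor by wrapping in a one-element tuple. The names `ONE`, `S`, `numeral`, and `show` are hypothetical, and Python integers are used only as the decimal shorthand.

```python
ONE = ()                       # stands for the primitive numeral 1

def S(n):
    """Successor: wrap the numeral in one more layer."""
    return (n,)

def numeral(k):
    """The formal numeral abbreviated by the decimal k (for k >= 1)."""
    n = ONE
    for _ in range(k - 1):
        n = S(n)
    return n

def show(n):
    """Write a numeral as the formal expression it abbreviates."""
    return "1" if n == ONE else "S(" + show(n[0]) + ")"

print(show(numeral(1)))   # 1
print(show(numeral(4)))   # S(S(S(1)))
```

In this model distinct numerals are distinct nested tuples, and no numeral of the form `S(n)` equals `ONE`, in line with the third and fourth axioms.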


Notational convention. Natural numbers will be denoted by lowercase Roman
letters such as $a, b, c, m, n, p, x, y, z$, with or without subscripts and/or superscripts.
Quantifiers involving these variables will be assumed to range over natural numbers.
Thus “for every $m$ there exists $n$” stands for “for every natural number $m$ there exists
a natural number $n$.”


Problem 52. $3 \neq 5$.


Problem 53. No natural number is its own successor: $S(n) \neq n$ for any $n$.


Problem 54 (Converse of Axiom 3). Every natural number other than 1 is the
successor of some natural number, i.e., if $n \neq 1$ then $n = S(m)$ for some $m$.


At this point, expressions such as $1 + 3$ or $(5 + 6) \cdot 7$ or statements like $3 < 5$
cannot be used; such expressions do not even make sense yet, since the operations
$+$ and $\cdot$ and the relation $<$ have not been defined.


2.3 Addition, Order, and Multiplication


Addition


Definition 55. The sum $m + n$ of two natural numbers $m$ and $n$ is defined “by
induction on $n$” as follows (for any $m$):


1. $m + 1 := S(m)$, and
2. $m + S(n) := S(m + n)$.


In other words, define $m + 1$ to be $S(m)$ (this is the case $n = 1$), and once $m + n$ is
defined, define $m + S(n)$ to be $S(m + n)$. This defines the sum of any two numbers.²
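
The two clauses of the definition amount to a recursive procedure. The sketch below illustrates this in one concrete model of the numerals, assumed here only for illustration (1 as the empty tuple, successor as wrapping in a tuple); the names `ONE`, `S`, `add`, and `to_int` are hypothetical, and Python integers appear only as decimal shorthand for display.

```python
ONE = ()                       # stands for the primitive numeral 1

def S(n):
    """Successor of a numeral."""
    return (n,)

def add(m, n):
    """Sum of two numerals, by recursion on the second argument."""
    if n == ONE:
        return S(m)            # clause 1: m + 1 := S(m)
    return S(add(m, n[0]))     # clause 2: m + S(k) := S(m + k), where n = S(k)

def to_int(n):
    """Decimal shorthand for a numeral (used only for display)."""
    k = 1
    while n != ONE:
        n, k = n[0], k + 1
    return k

TWO = S(ONE)
THREE = S(TWO)
FOUR = S(THREE)

print(add(TWO, TWO) == FOUR)        # True
print(to_int(add(THREE, FOUR)))     # 7
```

The recursion terminates because each call strips one successor from the second argument until the numeral 1 is reached.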


Problem 56. $2 + 2 = 4$.

Problem 57. $n + 1 = 1 + n$ for all $n$.


Problem 58. Addition (as defined above) is associative:
$$m + (n + p) = (m + n) + p.$$


[Hint: Use induction on $p$.]

Problem 59. Addition is commutative: $m + n = n + m$, for all $m$ and $n$.


Problem 60. Cancellation law for addition: If $m + p = n + p$, then $m = n$.


Order 


Definition 61. Define $m < n$ if and only if $n = m + p$ for some $p$. Also, write
$m > n$ for $n < m$.


²This can be done more rigorously using the method of definition by primitive recursion due to
Dedekind, covered in the last section of this chapter.

The next few results can be proved without induction.

Problem 62. $n < S(n)$ for all $n$.

Problem 63. $n \not< n$ for all $n$; that is, there are no $n, p$ such that $n = n + p$.

Problem 64. For all $n$, either $1 < n$ or $1 = n$. Also, there is no $n$ with $n < 1$.
Thus 1 is “the least natural number” (less than all other natural numbers).

Problem 65. $m < n$ if and only if either $S(m) < n$ or $S(m) = n$.

Problem 66. $m + k < n + k$ if and only if $m < n$.


Recall from the previous chapter that a relation on a set is called a linear order if 
it is transitive, irreflexive, and connected on the set. 


Theorem 67. $<$, as defined above, is a linear order on the natural numbers.


Proof. Transitivity of $<$ is an easy consequence of the associative property of
addition: If $m < n$ and $n < p$, then $n = m + r$ and $p = n + s$ for some $r, s$.
Hence $p = n + s = (m + r) + s = m + (r + s) = m + t$, where $t := r + s$, so
$m < p$.

Irreflexivity is a direct consequence of Problem 63.

Finally, to show that $<$ is connected, define a property $P$ as follows, writing
“$P(k)$” as a shorthand for “$k$ has property $P$”:


$P(k)$ is true if and only if for every $n$, either $k < n$, or $k = n$, or $k > n$.


We establish that $<$ is connected on the set of natural numbers by showing that $P(k)$ is
true for all $k$, which is proved by induction:

First, $P(1)$ is true, as 1 is less than all other natural numbers (Problem 64).

Next, suppose that $P(k)$ is true. Then for every $n$, either $k < n$, in which case
$S(k) < n$ or $S(k) = n$ (Problem 65), and so $P(S(k))$ is true; or $k = n$, in which case $S(k) > n$,
so $P(S(k))$ is true; or $k > n$, in which case $S(k) > k > n$ by transitivity, so again
$P(S(k))$ is true. Thus $P(S(k))$ is true if $P(k)$ is true.

Therefore, by induction $P(k)$ is true for every natural number $k$. □


Theorem 68 (The Well-Ordering Property). Every nonempty set $A$ of natural
numbers has a “least” element $m \in A$ such that for all $k \in A$ either $m < k$ or
$m = k$.


Proof. We prove the equivalent statement that if $A$ has no least element then $A$ must
be empty. So suppose that $A$ does not have a least element.

Let $P$ be the property of being less than every member of $A$, that is, a natural
number $n$ has property $P$ if and only if $n < k$ for all $k \in A$.

First, since 1 is less than all other natural numbers and $A$ has no least element,
$1 \notin A$. Hence 1 has property $P$, again since 1 is less than all other natural numbers.

Next, suppose that $n$ has $P$. Then $S(n) \notin A$, since otherwise $S(n)$ would be the
least element of $A$: for any $k \in A$ we have $n < k$, so by Problem 65 $S(n) < k$ or
$S(n) = k$. Hence $S(n)$ has $P$: for any $k \in A$, $n < k$, so $S(n) < k$ or $S(n) = k$ (by
Problem 65), but we cannot have $S(n) = k$ since $S(n) \notin A$, so $S(n) < k$. Thus we
have shown that if $n$ has $P$ then $S(n)$ has $P$.

By induction, every natural number has property $P$. So $A$ is empty, since any $k \in A$
would have property $P$ and hence satisfy $k < k$, contradicting irreflexivity. □


Remark. This theorem is actually equivalent to the general induction axiom. It is 
said to be phrased in the language of second order arithmetic, since, unlike most 
other results of this chapter, it talks about all sets of natural numbers. 


Problem 69. If $m > n$ there is a unique $k$ such that $m = n + k$.


Definition 70 (Subtraction). If $m > n$, define $m - n$ to be the unique $k$ with
$m = n + k$.


Multiplication 


Definition 71. The product $m \cdot n$ of two natural numbers $m$ and $n$ is defined by
induction on $n$ as follows (for any $m$):


1. $m \cdot 1 := m$, and
2. $m \cdot S(n) := (m \cdot n) + m$.


We write $mn$ for $m \cdot n$.
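
As with addition, the two clauses can be read as a recursive procedure. The sketch below is an illustration in one assumed model of the numerals (1 as the empty tuple, successor as wrapping in a tuple); the names `ONE`, `S`, `add`, `mul`, and `to_int` are hypothetical, and Python integers serve only as decimal shorthand for display.

```python
ONE = ()                       # stands for the primitive numeral 1

def S(n):
    """Successor of a numeral."""
    return (n,)

def add(m, n):
    """Sum: m + 1 := S(m), and m + S(k) := S(m + k)."""
    return S(m) if n == ONE else S(add(m, n[0]))

def mul(m, n):
    """Product of two numerals, by recursion on the second argument."""
    if n == ONE:
        return m                    # clause 1: m . 1 := m
    return add(mul(m, n[0]), m)     # clause 2: m . S(k) := (m . k) + m, where n = S(k)

def to_int(n):
    """Decimal shorthand for a numeral (used only for display)."""
    k = 1
    while n != ONE:
        n, k = n[0], k + 1
    return k

TWO = S(ONE)
THREE = S(TWO)

print(to_int(mul(TWO, THREE)))             # 6
print(mul(TWO, THREE) == mul(THREE, TWO))  # True
```

Multiplication is built entirely from the successor and the previously defined addition, so nothing beyond the primitive notions is used.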

Problem 72. $2 \cdot 3 = 6$.


Problem 73. Multiplication (as defined above) is distributive over addition:
$$m(n + p) = mn + mp.$$


[Hint: Use induction on $p$.]


Problem 74. Multiplication is associative:


$$m(np) = (mn)p.$$


[Hint: Use induction on $p$.]

Problem 75. Multiplication is commutative: $mn = nm$, for all $m$ and $n$.

Problem 76. Cancellation law for multiplication: If $mp = np$ then $m = n$.

Problem 77. $mp < np$ if and only if $m < n$.

Problem 78. $m < 2m$.

Notation. We write $n^2$ for $nn$.


Problem 79. $m^2 < n^2$ if and only if $m < n$.

Definition 80. Define $n$ to be even if and only if $n = 2m$ for some $m$. Define $n$ to
be odd if and only if $n = 1$ or $n = S(2m)$ for some $m$.


Problem 81. For every $n$, either $n$ is odd or $n$ is even but not both. Moreover, $n$ is
even if and only if $S(n)$ is odd, and $n$ is odd if and only if $S(n)$ is even.


Problem 82. $n$ is even if and only if $n^2$ is even, and $n$ is odd if and only if $n^2$ is odd.

Theorem 83. There do not exist $m, n$ such that $m^2 = 2n^2$.


Proof. Let $A := \{ m \mid m^2 = 2n^2 \text{ for some } n \}$. The result will follow if we show
that $A$ is empty, so we assume $A$ is nonempty and derive a contradiction. By the
Well-Ordering Property, fix a least member $m \in A$. Then we can fix $p$ such that
$m^2 = 2p^2$. Then $p^2 < m^2$ (Problem 78), hence by Problem 79, $p < m$. Also,
since $m^2$ is even, $m$ is even by Problem 82. Hence $m = 2q$ for some $q$. So
$2q \cdot 2q = 2p^2$, or $p^2 = 2q^2$. So $p \in A$. But this is impossible since $p < m$ and $m$
is the least member of $A$. □


Remark. In this proof, we had to avoid number theoretic properties such as reduced 
fractions, gcds, relatively prime numbers, etc., which are not available to us at this 
point. 


2.4 Fractions and Ratios 


Definition 84. A fraction is an ordered pair of natural numbers (m,n). 


Thus N x N is the set of all fractions. For a fraction (m,n), m and n are called the 
numerator and denominator, respectively. 


Definition 85 (Equivalent Fractions). We say that the fractions (m,n) and (p,q) 
are equivalent, and write (m,n) ~ (p,q) if and only if mq = np. 
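
The equivalence test is a single cross-multiplication, which makes it easy to experiment with. The sketch below is an illustration only: it represents fractions as Python pairs of positive integers and relies on built-in integer multiplication, which the text has not yet licensed formally; the names `equivalent` and `class_members` are hypothetical.

```python
def equivalent(f, g):
    """Fractions (m, n) and (p, q) are equivalent if and only if mq = np."""
    (m, n), (p, q) = f, g
    return m * q == n * p

def class_members(f, bound):
    """Fractions equivalent to f with numerator and denominator at most `bound`.

    The full equivalence class is infinite, so only a finite part is listed.
    """
    return [(p, q)
            for p in range(1, bound + 1)
            for q in range(1, bound + 1)
            if equivalent((p, q), f)]

print(equivalent((2, 3), (4, 6)))   # True
print(equivalent((2, 3), (3, 2)))   # False
print(class_members((2, 3), 9))     # [(2, 3), (4, 6), (6, 9)]
```

Listing part of a class this way shows concretely that many different ordered pairs are related to one another by the cross-multiplication condition.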


Problem 86. $(mk, nk) \sim (m, n)$ for all $m, n, k$.


Problem 87. ~ is an equivalence relation on the set N x N of all fractions, and so 
N x N is partitioned into ~-equivalence classes. 


Problem 88. Find the equivalence classes [(1, 1)], [(3, 1)], and [(2, 4)]. 


Definition 89. $\frac{m}{n}$ denotes the $\sim$-equivalence class of the fraction $(m, n)$:


$$\frac{m}{n} := [(m, n)] = \{ (p, q) \mid (p, q) \sim (m, n) \}.$$
Such an equivalence class of fractions is called a ratio (or positive rational):


$$\rho \text{ is a ratio if and only if } \rho = \frac{m}{n} \text{ for some } m, n.$$

Thus the collection of all ratios is identical to the partition determined by the
equivalence relation $\sim$ (equivalence of fractions).


Ratios will be denoted by lowercase Greek letters such as $\rho, \sigma, \tau, \alpha, \beta, \gamma, \xi, \eta$, and $\zeta$,
and quantifiers involving these variables will be assumed to range over ratios. Thus
“for every $\rho$ there exists $\sigma$” really means “for every ratio $\rho$ there exists a ratio $\sigma$.”


Note. The fraction $(m, n)$ should be distinguished from the ratio $\frac{m}{n}$. The fraction
$(m, n)$ is simply an ordered pair, and therefore is a member of $\mathbf{N} \times \mathbf{N}$. The ratio $\frac{m}{n}$
is the set of all fractions equivalent to the fraction $(m, n)$, so $\frac{m}{n}$ is an entire set of
fractions, and thus is a subset of $\mathbf{N} \times \mathbf{N}$.


Problem 90. Explain what is wrong with the claim:


$$\frac{n}{1} = n.$$
Problem 91. $\rho = \frac{m}{n}$ if and only if $(m,n) \in \rho$.


Problem 92. $\frac{m}{n} = \frac{p}{q}$ if and only if $(m,n) \sim (p,q)$.


2.5 Order, Addition, and Multiplication of Fractions 
and Ratios 


Order for Fractions and Ratios 


To “compare” two fractions (m,n) and (p,q), we can (by Problem 86) find 
corresponding equivalent fractions (mq,nq) ~ (m,n) and (np,nq) ~ (p,q) with 
a “common denominator” nq, and compare just the numerators.


Definition 93. Define (m,n) < (p,q) if and only if mq < np. 

Problem 94. If (m,n) < (p,q) and (p,q) < (r,s), then (m,n) < (r,s).


Problem 95. Given fractions (m,n) and (p,q), exactly one of the conditions 


(m,n) < (p,q), (m,n) ~ (p,q), (m,n) > (p,q), 


is true. 

Problem 96. If (m,n) ~ (m’,n’), (p,q) ~ (p’,q’), and (m,n) < (p,q), then 
(m',n') < (p',q’). 

Thus if a fraction in one class is less than a fraction in another class, then the same
is true for all pairs of representatives from the two classes. Hence the following is
well defined:

Definition 97. Define $\rho < \sigma$ if and only if there are $m, n, p, q$ with $\rho = \frac{m}{n}$,
$\sigma = \frac{p}{q}$, and $(m,n) < (p,q)$.

Problem 98. <, as defined in the last definition for ratios, is a linear order on the 
set of ratios (i.e., transitive, irreflexive, and connected). 


Addition and Multiplication of Fractions and Ratios 


To add two fractions (m,n) and (p,q), we can as before take the corresponding 
equivalent fractions (mq,nq) ~ (m,n) and (np,nq) ~ (p,q) with the common
denominator nq, and then add the numerators. For multiplication, the numerators, 
and separately the denominators, are simply multiplied together. 


Definition 99 (Addition of Fractions). The sum of two fractions is defined as 


(m,n) + (p,q) = (mq + np, nq).

Problem 100. If (m,n) ~ (m′,n′) and (p,q) ~ (p′,q′), then

(m,n) + (p,q) ~ (m′,n′) + (p′,q′).

Thus the class of the sum depends only on the classes to which the summands
belong, making the following definition for addition of ratios well defined:


Definition 101 (Addition of Ratios). The sum of two ratios $\rho = \frac{m}{n}$ and $\sigma = \frac{p}{q}$ is
defined as

$$\rho + \sigma = \frac{m}{n} + \frac{p}{q} := \frac{mq + np}{nq}.$$


Definition 102 (Multiplication of Fractions). The product of two fractions is 
defined as 


(m,n) · (p,q) := (mp, nq).

Problem 103. If (m,n) ~ (m′,n′) and (p,q) ~ (p′,q′), then

(m,n) · (p,q) ~ (m′,n′) · (p′,q′).


Thus the class of the product depends only on the classes to which the factors
belong. Hence the following is well defined.

Definition 104 (Multiplication of Ratios). The product of two ratios $\rho = \frac{m}{n}$ and
$\sigma = \frac{p}{q}$, denoted by $\rho \cdot \sigma$ or $\rho\sigma$, is defined as

$$\rho\sigma = \frac{m}{n} \cdot \frac{p}{q} := \frac{mp}{nq}.$$
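
The definitions of the last two sections translate directly into code. The following Python sketch is not part of the original text: it models a fraction as a pair of positive integers and implements Definitions 85, 93, 99, and 102. All function names are illustrative choices. The finite check at the end is only a sanity test of Problems 96, 100, and 103 on small inputs (using equivalent fractions of the form (mk,nk) from Problem 86); it is not a proof.

```python
from itertools import product

def equivalent(f, g):
    """(m, n) ~ (p, q) if and only if mq = np (Definition 85)."""
    (m, n), (p, q) = f, g
    return m * q == n * p

def less(f, g):
    """(m, n) < (p, q) if and only if mq < np (Definition 93)."""
    (m, n), (p, q) = f, g
    return m * q < n * p

def add(f, g):
    """(m, n) + (p, q) = (mq + np, nq) (Definition 99)."""
    (m, n), (p, q) = f, g
    return (m * q + n * p, n * q)

def mul(f, g):
    """(m, n) . (p, q) = (mp, nq) (Definition 102)."""
    (m, n), (p, q) = f, g
    return (m * p, n * q)

def respects_equivalence(bound=4):
    """Spot-check Problems 96, 100, 103 on fractions with entries <= bound."""
    naturals = range(1, bound + 1)
    fractions = list(product(naturals, repeat=2))
    for f, g in product(fractions, repeat=2):
        for j, k in product(naturals, repeat=2):
            f2 = (f[0] * j, f[1] * j)   # f2 ~ f by Problem 86
            g2 = (g[0] * k, g[1] * k)   # g2 ~ g by Problem 86
            if less(f, g) != less(f2, g2):
                return False
            if not equivalent(add(f, g), add(f2, g2)):
                return False
            if not equivalent(mul(f, g), mul(f2, g2)):
                return False
    return True

print(equivalent((1, 2), (2, 4)))   # True
print(add((1, 2), (1, 3)))          # (5, 6)
print(respects_equivalence())       # True
```

A ratio in the sense of Definition 89 is an infinite set of such pairs, so the sketch works with representatives only; the check illustrates why the choice of representative does not matter.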


2.6 Properties of Addition and Multiplication of Ratios 


Problem 105. $\frac{m}{p} + \frac{n}{p} = \frac{m+n}{p}$ and $\frac{m}{p} < \frac{n}{p}$ if and only if $m < n$.


Problem 106 (Commutative Laws). $\rho + \sigma = \sigma + \rho$, and $\rho\sigma = \sigma\rho$.


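Problem 107 (Associative Laws). $(\rho + \sigma) + \tau = \rho + (\sigma + \tau)$, and $(\rho\sigma)\tau = \rho(\sigma\tau)$.

Problem 108 (Cancellation Laws). If $\rho + \tau = \sigma + \tau$ or if $\rho\tau = \sigma\tau$, then $\rho = \sigma$.

Problem 109 (Distributive Law). $\rho(\sigma + \tau) = \rho\sigma + \rho\tau$.

Problem 110. $\rho < \rho + \sigma$.

Problem 111. If $\rho < \sigma$, then there is a unique $\xi$ such that $\rho + \xi = \sigma$.

Corollary 112. $\rho < \sigma$ if and only if $\sigma = \rho + \xi$ for some (unique) $\xi$.


Definition 113 (Subtraction). If $\rho < \sigma$, define $\sigma - \rho$ to be the unique $\xi$ with
$\rho + \xi = \sigma$.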

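Problem 114. $\rho < \sigma$ if and only if $\rho + \tau < \sigma + \tau$ if and only if $\rho\tau < \sigma\tau$.

Problem 115 (Identity and Reciprocal). $\rho \cdot \frac{1}{1} = \rho$ and $\frac{m}{n} \cdot \frac{n}{m} = \frac{1}{1}$.

Problem 116. For any $\rho, \sigma$ there is a unique $\xi$ such that $\xi \cdot \rho = \sigma$.

Definition 117 (Division). $\sigma/\rho$ denotes the unique $\xi$ such that $\xi \cdot \rho = \sigma$.

Corollary 118. $(\sigma/\rho)\rho = \sigma$.

Problem 119. If $\rho_1 < \sigma_1$ and $\rho_2 < \sigma_2$, then $\rho_1 + \rho_2 < \sigma_1 + \sigma_2$ and $\rho_1\rho_2 < \sigma_1\sigma_2$.


Problem 120 (Difference of Squares). If $\alpha < \beta$ so that $\beta - \alpha$ is defined, then
$\beta^2 = \alpha^2 + (\beta - \alpha)(\beta + \alpha)$, where $\sigma^2$ stands for $\sigma \cdot \sigma$.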


2.7 Integral Ratios and the Embedding 
of the Natural Numbers 


Definition 121. A ratio $\rho$ is said to be integral if $\rho = \frac{m}{1}$ for some $m$.


Problem 122. $\rho$ is integral if and only if $(m, 1) \in \rho$ for some $m$.


Problem 123. $\frac{m}{n}$ is integral if and only if $m = nk$ for some $k$.


We will now see that the integral ratios form a subset of the ratios which is 
structurally identical, or “isomorphic,” to the natural numbers in the following sense: 
There is a one-to-one correspondence between the natural numbers and the integral 
ratios which preserves the operations of addition and multiplication as well as the 
order relation (Problem 125). Such a bijection is called an isomorphism. 


Problem 124. $\frac{m}{1} = \frac{n}{1}$ if and only if $m = n$. Thus the mapping $n \mapsto \frac{n}{1}$ is a
bijection from the set of natural numbers onto the set of integral ratios.


Problem 125 (Isomorphism of Natural Numbers with Integral Ratios). For any
$m, n$:

$$\frac{m}{1} + \frac{n}{1} = \frac{m+n}{1}, \qquad \frac{m}{1} \cdot \frac{n}{1} = \frac{m \cdot n}{1}, \qquad\text{and}\quad \frac{m}{1} < \frac{n}{1} \text{ if and only if } m < n.$$


Problem 126. The integral ratios satisfy the five Dedekind–Peano axioms when

• 1 is interpreted as $\frac{1}{1}$ and

• $S\left(\frac{n}{1}\right)$ is interpreted as $\frac{S(n)}{1}$.


At this point, the natural numbers and the integral ratios become interchangeable 
since all the properties of the natural numbers listed in the initial sections are 
possessed by the integral ratios. 

Therefore, we throw away the natural numbers³ and use the corresponding
integral ratios in their place. The old natural numbers are not used directly anymore, 
and so we now deal with only one type of numbers, namely the ratios, which include 
the “new natural numbers” (really the integral ratios) as a subset. 

This process is known as embedding the natural numbers into the ratios. 


Definition 127 (New Meaning for the Natural Number Symbols). With the old
natural numbers thrown out, the integral ratio $\frac{n}{1}$ will now be denoted simply by
the letter $n$ and called the natural number $n$ (similarly for other lowercase Roman
letters). Not only do lowercase Roman letters now denote the new natural numbers
(integral ratios) by default, but any other symbol previously used for a natural
number will also denote the corresponding new natural number.


For example, the symbol 1 now stands for the integral ratio $\frac{1}{1}$, the symbol 2 for the
integral ratio $\frac{2}{1}$, etc. The resulting notational ambiguity is not a real problem, as the
intended interpretation can be determined from context.


³ Phrase of Edmund Landau [47].


This allows us to mix symbols that were previously assigned to different types,
and “$n + \rho$” and “$n \cdot \rho$” now become valid terms. But we remind the reader
again that lowercase Roman letters will denote the new natural numbers (really
the integral ratios), and lowercase Greek letters will continue to denote arbitrary
ratios. Fractions will no longer be used.


Problem 128. $\frac{m}{n} = m/n$. Also, since $\sigma \cdot 1 = \sigma = 1 \cdot \sigma$, so $\sigma/1 = \sigma$ and $\sigma/\sigma = 1$.
Finally, $\sigma(1/\sigma) = 1$.


2.8 The Archimedean and Fineness Properties 


Problem 129 (The Archimedean Property for Ratios). For any $\rho, \sigma$ there is $n$
such that $n\rho > \sigma$.


Problem 130. For any $\rho$, there exists $\sigma > \rho$, and also there exists $\tau < \rho$.


Problem 131 (Density). If $\rho < \sigma$, then there is $\tau$ such that $\rho < \tau < \sigma$.


The last two results express the fact that the ratios form a dense linear order without 
end points. 


Definition 132. We say that the pair $L, U$ is a Dedekind partition of the ratios if $L$
and $U$ are nonempty sets forming a partition of the ratios such that every ratio in $L$
is smaller than every ratio in $U$, that is, $\rho < \sigma$ for all $\rho \in L$ and $\sigma \in U$.


For example, if $L := \{\rho \mid \rho \le 1\}$ and $U := \{\rho \mid \rho > 1\}$, then $L, U$ forms a
Dedekind partition of the ratios.


Problem 133. If $L, U$ is a Dedekind partition of the ratios, then $L$ is “downward
closed under <” meaning that if $\rho \in L$ and $\rho' < \rho$ then $\rho' \in L$, and similarly $U$ is
upward closed under >.


The following property, which we call the Fineness Property for ratios, is closely
related to the Archimedean property.⁴ It will be used in the next section and in the
next chapter when we study Dedekind partitions in detail.


Theorem 134 (Fineness Property for Ratios). If $L, U$ is a Dedekind partition of
the ratios, then for any $\epsilon$ there are $\rho \in L$ and $\sigma \in U$ such that $\sigma - \rho < \epsilon$, that is,
$\sigma < \rho + \epsilon$.


⁴ The notion can be defined for (the positive elements of) any ordered field, where it will hold
if and only if the field is Archimedean. A Dedekind partition $L, U$ satisfying the condition of
Theorem 134 is sometimes called a Scott cut.


Remark. Like Theorem 68, this is a result of second order arithmetic, since, unlike
most other results of this chapter, it quantifies over sets of ratios.


Proof. Let $\epsilon$ be given. Fix $\alpha \in L$ and $\beta \in U$. By the Archimedean property fix a
natural number $n$ with $n > 1/\alpha$ and also $n > 1/\epsilon$. Then $1/n < \alpha$, and so $1/n \in L$.
Also $1/n < \epsilon$. There is $k$ such that $k/n > \beta$ (by the Archimedean property again),
and so there is $k$ with $k/n \in U$, hence by the Well-Ordering property we can fix
the least natural number $m$ such that $m/n \in U$. Then $m \neq 1$ since $1/n \notin U$.
Hence $m = p + 1$ for some $p$. Put $\rho = p/n$ and $\sigma = m/n$. Then $\sigma \in U$ and
since $m$ is the least natural number for which $m/n \in U$, so $\rho = p/n \in L$. Finally,
$\sigma = \rho + 1/n < \rho + \epsilon$. □


2.9 Irrationality of √2 and Density of Square Ratios


Definition 135. We write $\rho^2$ for $\rho\rho$. A ratio $\sigma$ is said to be a square ratio if $\sigma = \rho^2$
for some $\rho$. A ratio $\sigma$ is said to be a nonsquare ratio if it is not a square ratio, i.e., if
there is no $\rho$ such that $\sigma = \rho^2$.


For example, 1 is a square ratio since $1 = 1^2$, but 2 is a nonsquare ratio by
Theorem 83:


Problem 136. There is no $\rho$ such that $\rho^2 = 2$.

Problem 137. $\rho < \sigma$ if and only if $\rho^2 < \sigma^2$.

The following says that the square ratios are “dense” in the set of all ratios:


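Theorem 138 (Density of Square Ratios). Given $\rho < \sigma$, there is $\beta$ such that
$\rho < \beta^2 < \sigma$.


Proof. Let $\rho < \sigma$, and put $L := \{\gamma \mid \gamma^2 \le \rho\}$ and $U := \{\gamma \mid \gamma^2 > \rho\}$. Then $L, U$ is
a Dedekind partition by Problem 137, so by the fineness property we can fix $\alpha \in L$
and $\beta \in U$ with $\beta - \alpha < (\sigma - \rho)/(2(\sigma + 1))$. We can assume $\beta < \sigma + 1$ (since
otherwise we could have replaced $\beta$ by $\sigma + 1/2$), and so $\beta + \alpha < 2\beta < 2(\sigma + 1)$.
Hence by Problem 120:

$$\rho < \beta^2 = \alpha^2 + (\beta - \alpha)(\beta + \alpha) < \rho + (\beta - \alpha)(2(\sigma + 1)) < \rho + (\sigma - \rho) = \sigma.$$
□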
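
The proof above is constructive enough to be carried out mechanically. The following Python sketch is an illustration added here, not the author's own procedure: it uses the standard library type `Fraction` as a stand-in for ratios, and the name `square_between` is hypothetical. It combines the steps of the proofs of Theorems 134 and 138: choose $n$ with $1/n$ smaller than both $\rho$ and the bound $(\sigma - \rho)/(2(\sigma + 1))$, then take the least $m$ with $(m/n)^2 > \rho$.

```python
from fractions import Fraction
from math import floor

def square_between(rho, sigma):
    """Return a ratio beta with rho < beta**2 < sigma, assuming 0 < rho < sigma."""
    rho, sigma = Fraction(rho), Fraction(sigma)
    assert 0 < rho < sigma
    # The bound on beta - alpha used in the proof of Theorem 138.
    eps = (sigma - rho) / (2 * (sigma + 1))
    # Archimedean property: n > 1/eps and n > 1/rho.
    # Then (1/n)**2 <= 1/n < rho, so 1/n lies in L = {gamma : gamma**2 <= rho}.
    n = floor(max(1 / eps, 1 / rho)) + 1
    # Least m with m/n in U = {gamma : gamma**2 > rho} (simple linear search).
    m = 1
    while Fraction(m, n) ** 2 <= rho:
        m += 1
    return Fraction(m, n)

beta = square_between(Fraction(2), Fraction(21, 10))
print(beta, beta ** 2)
assert Fraction(2) < beta ** 2 < Fraction(21, 10)
```

The linear search is chosen for closeness to the proof, not for speed; for ratios that are very close together it may take many steps.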


Corollary 139. If $\rho^2 < 2$, then there is $\sigma > \rho$ with $\rho^2 < \sigma^2 < 2$. Similarly, if
$\rho^2 > 2$, then there is $\sigma < \rho$ with $\rho^2 > \sigma^2 > 2$.


Corollary 140. $L := \{\rho \mid \rho^2 < 2\}$ and $U := \{\rho \mid \rho^2 > 2\}$ form a Dedekind
partition of the ratios with $L$ having no largest element and $U$ having no smallest
element.


In the last corollaries, we could obviously replace 2 by any nonsquare ratio. 


Like the square ratios, the nonsquare ratios are also dense in the set of all ratios:

Corollary 141. Given $\rho < \sigma$, there is some nonsquare ratio $\tau$ such that $\rho < \tau < \sigma$.


Proof. Let $\rho < \sigma$. Then $\rho/2 < \sigma/2$, so $\rho/2 < \beta^2 < \sigma/2$, or $\rho < 2\beta^2 < \sigma$ for
some $\beta$. But $2\beta^2$ is a nonsquare ratio, as otherwise $(2\beta^2)/\beta^2 = 2$ would be a square
ratio. □


Remark. Although $\sqrt{2}$ does not exist (as a ratio), we do have arbitrarily close
approximations to it both from below and from above: Given any $\epsilon$, we can
apply the fineness property to the Dedekind partition $L := \{\rho \mid \rho^2 < 2\}$ and
$U := \{\rho \mid \rho^2 > 2\}$ to get $\rho \in L$ and $\sigma \in U$ with $\sigma - \rho < \epsilon$. Since we
expect $\sqrt{2}$ (whatever it may be) to lie between $\rho$ and $\sigma$, we can regard $\rho$ and $\sigma$
as approximations differing from the target $\sqrt{2}$ by an amount less than $\epsilon$.
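
To make the remark concrete, the construction in the proof of Theorem 134 can be run on the partition for 2. The Python sketch below is an added illustration under stated assumptions: ratios are modeled by `Fraction`, the partition is given by a membership test for $U$, and the names `fineness_pair` and `in_upper` are hypothetical. The starting elements $\alpha = 1 \in L$ and $\beta = 2 \in U$ are chosen by hand.

```python
from fractions import Fraction
from math import floor

def in_upper(rho):
    """Membership test for U = {rho : rho**2 > 2}."""
    return rho * rho > 2

def fineness_pair(in_U, alpha, beta, eps):
    """Follow the proof of Theorem 134.

    Assumes alpha is in L, beta is in U, and eps is a positive ratio.
    Returns (rho, sigma) with rho in L, sigma in U, and sigma - rho < eps.
    """
    alpha, beta, eps = Fraction(alpha), Fraction(beta), Fraction(eps)
    n = floor(max(1 / alpha, 1 / eps)) + 1   # n > 1/alpha and n > 1/eps
    k = floor(beta * n) + 1                  # k/n > beta, hence k/n is in U
    # Least m with m/n in U; the search stops at k at the latest.
    m = next(j for j in range(1, k + 1) if in_U(Fraction(j, n)))
    return Fraction(m - 1, n), Fraction(m, n)

lo, hi = fineness_pair(in_upper, 1, 2, Fraction(1, 1000))
print(lo, hi)
assert lo * lo < 2 < hi * hi
assert hi - lo < Fraction(1, 1000)
```

Since $1/n < \alpha$ places $1/n$ in $L$, the least $m$ found is at least 2, so `Fraction(m - 1, n)` is a genuine (positive) ratio.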


Problems Using Concepts from Abstract Algebra 


The following problems are meant for students with prior exposure to abstract 
algebra. 


Problem 142. The ratios form an abelian group under multiplication. 


Problem 143. Generalize the fineness property for the positive elements of an
ordered field. Then show that the positive elements of an ordered field have the
fineness property if and only if the field is Archimedean.


Our method of going from the natural numbers to the ratios is a basic method in
algebra in which one embeds a given commutative cancellative semigroup A into
a group constructed by taking pairs of elements of A and forming a quotient. The
semigroup we started with was N with the operation of multiplication, but addition
could have been incorporated as well.


Problem 144. Construct the integral domain Z of signed integers from N × N, where
(m,n) is identified with (p,q) if and only if m + q = n + p, by defining addition
and multiplication appropriately.


The following result of Dedekind shows that the Dedekind—Peano axioms charac- 
terize the natural numbers with the successor function up to isomorphism: 


Problem 145 (Dedekind). If $S\colon N \to N$ with $1 \in N$, $S'\colon N' \to N'$ with $1' \in N'$,
and if both structures satisfy the Dedekind–Peano axioms, then there is a unique
bijection $h$ from $N$ onto $N'$ which preserves 1 and the successor functions, that is,
such that $h(1) = 1'$ and $h(S(n)) = S'(h(n))$ for all $n \in N$.


2.10 Recursive Definitions*


Recall that we had “defined” addition of natural numbers by the following recursion 
equations: 


m+1:= S(m), and m+ S(n):= S(m+n). 


But this is not an explicit definition! We took it for granted (as was done in the work 
of Peano) that a two-place function + (the mapping $(m,n) \mapsto m+n$) satisfying the
above equations exists, without giving any rigorous justification for its existence. 
Similarly, multiplication of natural numbers was “defined” by recursion equations 
without proper justification. 

Dedekind introduced a general method, known as primitive recursion, which 
provides such justification. It assures the existence and uniqueness of functions 
which are defined implicitly using recursion equations having forms similar to the 
ones for addition and multiplication. 

We will formulate and prove a general version of Dedekind’s principle of 
recursive definition, from which the existence and uniqueness for the addition and 
multiplication functions can be immediately derived. 


Principles of Recursive Definition 


The following Basic Principle of Recursive Definition is perhaps the simplest yet 
very useful result for defining functions recursively. 


Theorem 146 (Basic Principle of Recursive Definition). If $Y$ is a set, $a \in Y$, and
$h\colon Y \to Y$, then there is a unique $f\colon \mathbf{N} \to Y$ such that

$$f(1) = a, \quad\text{and}\quad f(n+1) = h(f(n)) \quad\text{for all } n \in \mathbf{N}.$$


Informally, this says that given $a \in Y$ and $h\colon Y \to Y$, we can form the infinite
sequence $\langle a, h(a), h(h(a)), \dots \rangle$.


Proof. First note that the uniqueness of the function $f$ can be established by an easy
and routine induction, so let us prove existence.

Let $I_n := \{1, 2, \dots, n\} = \{k \in \mathbf{N} \mid 1 \le k \le n\}$ denote the set of the first $n$
natural numbers. The proof uses functions $u\colon I_n \to Y$ having domain $I_n$, i.e., finite
sequences from $Y$ of length $n$ (with varying $n$).

Let us say that a function $u$ is partially $h$-recursive with domain $I_n$ if $u\colon I_n \to Y$,
$u(1) = a$, and $u(k + 1) = h(u(k))$ for all $k$ with $1 \le k < n$.

We first prove by induction that for every $n \in \mathbf{N}$ there is a unique partially
$h$-recursive $u$ with domain $I_n$.

Basis step ($n = 1$): Let $v\colon \{1\} \to Y$ be defined by setting $v(1) = a$. Then $v$
is partially $h$-recursive with domain $I_1$. Moreover, if $u, u'\colon I_1 \to Y$ are partially
$h$-recursive functions with domain $I_1$, then $u(1) = a = u'(1)$, so $u = u'$ since 1 is
the only element in their domain $I_1 = \{1\}$. So there is a unique partially $h$-recursive
$v$ with domain $I_1$, establishing the basis step.

Induction step: Suppose that $n \in \mathbf{N}$ is such that there is a unique partially
$h$-recursive $v$ with domain $I_n$ (induction hypothesis). We fix this $v$ for the rest
of this step, and define $w\colon I_{n+1} \to Y$ by setting $w(k) := v(k)$ for $k \le n$ and
$w(k) := h(v(n))$ if $k = n + 1$. Then $w$ is easily seen to be partially $h$-recursive with
domain $I_{n+1}$. Moreover, if $u, u'\colon I_{n+1} \to Y$ are partially $h$-recursive with domain
$I_{n+1}$, then the restrictions $u|_{I_n}$ and $u'|_{I_n}$ are partially $h$-recursive with domain $I_n$,
so they must be identical by induction hypothesis, i.e., $u(k) = u'(k)$ for $1 \le k \le n$.
In particular, $u(n) = u'(n)$, so $u(n + 1) = h(u(n)) = h(u'(n)) = u'(n + 1)$, which
gives $u = u'$. Thus there is a unique partially $h$-recursive $w$ with domain $I_{n+1}$, which
finishes the induction step.

Thus for each $n$ there is a unique partially $h$-recursive function with domain $I_n$;
let us denote this function by $u_n$.

Now define $f\colon \mathbf{N} \to Y$ by setting:

$$f(n) := u_n(n).$$

First, $f(1) = a$ since $u_1(1) = a$. Next, the restriction of $u_{n+1}$ to $I_n$ equals $u_n$
(by uniqueness, since the restriction is partially $h$-recursive), so $u_{n+1}(n) = u_n(n)$.
Hence $f(n + 1) = u_{n+1}(n + 1) = h(u_{n+1}(n)) = h(u_n(n)) = h(f(n))$. Thus $f$
satisfies the recursion equations of the theorem. □
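
The existence argument has a direct computational reading. In the following Python sketch (an added illustration; the names `partial_recursive` and `define_by_recursion` are hypothetical), a finite sequence with domain $I_n$ is represented by a dictionary with keys $1, \dots, n$, and the function $f$ is obtained by reading off the last value of the $n$-th partial function, mirroring the proof.

```python
def partial_recursive(a, h, n):
    """Return the partially h-recursive function u_n as a dict on {1, ..., n}."""
    u = {1: a}                     # u(1) = a
    for k in range(1, n):          # for all k with 1 <= k < n
        u[k + 1] = h(u[k])         # u(k + 1) = h(u(k))
    return u

def define_by_recursion(a, h):
    """Return f with f(1) = a and f(n + 1) = h(f(n)), built as in the proof."""
    def f(n):
        return partial_recursive(a, h, n)[n]   # the last value of u_n
    return f

# Example with Y the integers, a = 1, and h(y) = 2y.
powers_of_two = define_by_recursion(1, lambda y: 2 * y)
print([powers_of_two(n) for n in range(1, 6)])   # [1, 2, 4, 8, 16]
```

Recomputing the partial function for every argument is wasteful, but it keeps the code parallel to the proof; the restriction of one partial function to a shorter domain agrees with the shorter partial function, as the uniqueness argument predicts.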


To handle functions of multiple variables, the following theorem is used. 


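Theorem 147 (General Principle of Recursive Definition). For any $g\colon X \to Y$
and $h\colon X \times \mathbf{N} \times Y \to Y$, there is a unique function $f\colon X \times \mathbf{N} \to Y$ such that for all
$x \in X$ and $n \in \mathbf{N}$:

$$f(x, 1) = g(x) \quad\text{and}\quad f(x, n+1) = h(x, n, f(x, n)).$$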


Here f is being defined by recursion on the second variable n, that is, n is the 
variable of recursion ranging over N, while x is a parameter ranging over the set X. 
This is the most general form of recursive definition, where both the parameters 
(in X) and the values (in Y) come from arbitrary sets. 


Proof. The proof is essentially the same as that of Theorem 146, since the additional 
parameter does not play any significant role in the recursion. The details are left as 
an exercise for the reader. □


Theorem 148 (Course of Values Recursion). Let $Y$ be a nonempty set and $Y^*$
denote the set of all finite sequences (strings) of elements from $Y$. Given any
$G\colon Y^* \to Y$ there is a unique $f\colon \mathbf{N} \to Y$ such that

$$f(n) = G(\langle f(k) \mid k < n \rangle) \quad\text{for all } n \in \mathbf{N}.$$

Denoting the empty string by $\varepsilon$, this means that $f(1) = G(\varepsilon)$, $f(2) = G(\langle f(1) \rangle)$,
$f(3) = G(\langle f(1), f(2) \rangle)$, etc.

Proof. Let $h_G\colon Y^* \to Y^*$ be the function defined by

$$h_G(u) := u^\frown G(u), \quad\text{i.e.}\quad h_G(\langle u_1, \dots, u_n \rangle) = \langle u_1, \dots, u_n, G(u) \rangle.$$

Here $u^\frown y = u * \langle y \rangle$ denotes the string obtained from the string $u$ by appending the
element $y \in Y$, so that $\mathrm{len}(u^\frown y) = \mathrm{len}(u) + 1$. The Basic Principle of Recursive
Definition (Theorem 146) gives a unique function $\varphi\colon \mathbf{N} \to Y^*$ with

$$\varphi(1) = h_G(\varepsilon), \quad\text{and}\quad \varphi(n+1) = h_G(\varphi(n)) \quad\text{for all } n \in \mathbf{N}.$$

Now note that $\varphi(n)$ is a finite sequence of length $n$ for every $n$, and put $f(n) :=
\varphi(n)(n)$ = the last coordinate of the finite sequence $\varphi(n)$. □
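
The reduction used in this proof can be sketched in Python as follows. This is an added illustration with hypothetical names: strings from $Y^*$ are modeled by tuples, `h_G` appends $G(u)$ to $u$, and the sequence of strings is iterated from the empty tuple exactly as in the proof. The particular `G` below is only an example chosen to show a value depending on two earlier values.

```python
def course_of_values(G):
    """Return f with f(n) = G((f(1), ..., f(n - 1))) for all n >= 1."""
    def h_G(u):
        return u + (G(u),)         # the string u with G(u) appended

    def phi(n):
        s = h_G(())                # phi(1) = h_G(empty string)
        for _ in range(1, n):
            s = h_G(s)             # phi(k + 1) = h_G(phi(k))
        return s                   # a tuple of length n

    def f(n):
        return phi(n)[-1]          # the last coordinate of phi(n)

    return f

# Example G: 1 on strings of length < 2, else the sum of the last two entries.
def G(u):
    return 1 if len(u) < 2 else u[-1] + u[-2]

f = course_of_values(G)
print([f(n) for n in range(1, 8)])   # [1, 1, 2, 3, 5, 8, 13]
```

The point of the design is that only the one-step recursion of Theorem 146 is used; the dependence on all earlier values is absorbed into the string-valued function.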


The form of recursion in the above theorem generalizes to transfinite ordinals, where 
it is called transfinite recursion (see Theorem 622 and Theorem 650). 


Primitive Recursion 


We start with a special case, which is an immediate corollary of Theorem 147. 


Theorem 149 (Primitive Recursion for Two-Place Functions). Given a one-variable
function $g\colon \mathbf{N} \to \mathbf{N}$ and a three-variable function $h\colon \mathbf{N}^3 \to \mathbf{N}$, there is
a unique two-variable function $f\colon \mathbf{N}^2 \to \mathbf{N}$ such that for all $m, n \in \mathbf{N}$:

$$f(m, 1) = g(m), \quad\text{and}\quad f(m, S(n)) = h(m, n, f(m, n)).$$


The result of this theorem is often expressed by saying that the function $f$ is
obtained from the functions $g$ and $h$ by primitive recursion.


Proof. This is simply Theorem 147 with $X = Y = \mathbf{N}$. □


We can now give a full justification for our original recursive definition of addition, 
by showing that the two-place function + can be obtained from the successor 
function by primitive recursion as follows: 

Let g = S be the successor function, and let h be the function defined by 
h(m,n, p) = S(p). Applying the last theorem with these g and h gives a two-place 
function f satisfying 


$$f(m, 1) = S(m), \quad\text{and}\quad f(m, S(n)) = S(f(m, n)).$$

But these are the same as our original recursion equations for defining addition, as
is easily verified by writing $m + n$ for $f(m, n)$:


m+1= S(m), and m+ S(n) = S(m+n). 
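
As an informal check of this derivation, the scheme of Theorem 149 can be written as a higher-order function in Python. This sketch is an added illustration with hypothetical names. It is circular from a foundational point of view, since Python's built-in integer arithmetic is used to implement the successor and to count recursion steps; it only shows that the equations compute what we expect.

```python
def S(n):
    """Stand-in for the successor function on N = {1, 2, 3, ...}."""
    return n + 1

def primitive_recursion(g, h):
    """Return f with f(m, 1) = g(m) and f(m, S(n)) = h(m, n, f(m, n))."""
    def f(m, n):
        value = g(m)                  # value = f(m, 1)
        for k in range(1, n):
            value = h(m, k, value)    # value = f(m, S(k))
        return value
    return f

# Addition obtained from the successor function: g = S and h(m, n, p) = S(p).
add = primitive_recursion(S, lambda m, n, p: S(p))

print(add(3, 1))   # 4
print(add(3, 4))   # 7
```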


Once we have justified the addition function, we can use it to obtain the
multiplication function ($(m,n) \mapsto mn$) by primitive recursion.


Problem 150. Prove that the multiplication function can be obtained from the 
identity function and the addition function using primitive recursion, verifying that 
it gives our original recursion equations for defining multiplication. 


The most general version of the primitive recursion principle, which is again an 
immediate corollary of Theorem 147, is formulated as follows: 


Theorem 151 (The General Principle of Primitive Recursion). Given a $(k - 1)$-place
function $g$ and a $(k + 1)$-place function $h$ on $\mathbf{N}$, there is a unique $k$-place
function $f$ on $\mathbf{N}$ such that for all $x_1, x_2, \dots, x_k \in \mathbf{N}$:

$$f(x_1, \dots, x_{k-1}, 1) = g(x_1, \dots, x_{k-1}), \quad\text{and}$$
$$f(x_1, \dots, x_{k-1}, S(x_k)) = h(x_1, \dots, x_k, f(x_1, \dots, x_k)).$$


Proof. This is Theorem 147 with $X = \mathbf{N}^{k-1}$ and $Y = \mathbf{N}$. □


As before, the function f in the above theorem is said to be defined by primitive 
recursion from g and h. 


After obtaining the addition and multiplication functions, one can keep applying 
primitive recursion repeatedly to define more and more functions on N. Essentially 
all commonly used functions, such as exponentiation, the factorial function, the gcd 
function, and so on, can be obtained via primitive recursion. 


Problem 152. Define the factorial function (one-place) as well as the exponentia- 
tion function (two-place) from the multiplication function using primitive recursion. 


Problem 153. What familiar single-variable function is defined using the following 
primitive recursion equations? 


$$f(1) = 1 \quad\text{and}\quad f(S(m)) = h(m, f(m)), \quad\text{where } h(m, n) := nS(m).$$


Remark. Principles of primitive recursion, such as Theorem 151, are results of 
second order arithmetic which involve quantification over functions of natural 
numbers: Functions are defined implicitly by assertions of the form “there is a 
unique function satisfying such and such recursion equations.” This is unavoidable 
in the Dedekind–Peano system $(\mathbf{N}, 1, S)$. However, if $+$ and $\cdot$ are also added as
primitives to obtain the extended system $(\mathbf{N}, 1, S, +, \cdot)$, then primitive recursion is
no longer necessary and functions such as exponentiation can be explicitly defined.
The reason for this is that $+$ and $\cdot$ have sufficient power to express the notion
of finite sequences (represented in a coded form as natural numbers), and so one
can essentially replicate the process given in the proof of Theorem 146 to produce
explicit definitions.


Chapter 3 
Dedekind’s Theory of the Continuum 


Abstract This chapter constructs the real numbers from the rational numbers using 
the method of Dedekind cuts and discusses properties of general linear continuums, 
such as the Intermediate Value Theorem, in the process. 


3.1 Introduction 


Modern Set Theory was born in the late nineteenth century primarily due to the
work of Richard Dedekind and Georg Cantor. Among many remarkable things,
they independently found two distinct methods for rigorously constructing the real
numbers from the ratios or rational numbers. Here we will follow Dedekind’s
method,¹ whose central idea is the geometric intuition of a linearly ordered
continuum.² The size of a continuum proved to be a difficult problem, and it
dominated a large part of twentieth century set theory.


3.2 Linear Continuum in Geometry 


The idea of a linear continuum is embodied in geometric notions such as a line,
a segment, or a ray. The points of a ray are ordered naturally if we declare that, for
points P and Q on a ray, P precedes Q (symbolically P < Q) if and only if
P is between the initial point of the ray and Q (using “betweenness” as a primitive
notion), as in:


¹ See Stoll [76] or Suppes [77] for Cantor’s method based on Cauchy sequences of rationals. See
also the remarks on Cantor’s method at the end of this chapter.


² Also known as a linear continuum, or an ordered continuum, or simply a continuum.


Classical axioms of geometry ensure that the relation expressed by “P precedes
Q” on the points of the ray satisfies two properties as follows.


1. Axiom of Order. The relation “P < Q” on the points of a ray is a transitive 
relation satisfying the law of trichotomy, i.e. < is an ordering of the points of the 
ray. 

2. Axiom of Order-Density. If P < Q then there is R such that P < R < Q.


These axioms are two necessary conditions for a linear continuum, but mathemati- 
cians since Pythagoras knew (see Sect. 3.3) that they are not sufficient to ensure a 
linear continuum. Until Dedekind, however, it was not clear what exactly is needed 
to capture the intuitive notion of a linear continuum. 


Analytic Geometry: Modeling the Ray by Ratios 


Analytic Geometry uses a correspondence between points and numbers (or n-tuples 
of numbers for n-dimensions, called coordinates) to transform geometric problems 
into problems of algebra and analysis (and back). If the points of a line or a ray 
correspond to a system of numbers, then the system of numbers in question must 
satisfy the two axioms above. 

Note that the ratios ordered by magnitude satisfy the two axioms above. 
Moreover, ratios are used to measure lengths of line segments and for all practical 
purposes suffice in this role. Thus one can think of a correspondence between the 
points of an open ray and the ratios, i.e., use the ratios as the system of numbers to 
assign “coordinates” to the points of the ray. This is done in such a way that if two 
points P and Q on the ray correspond to the ratios $\rho$ and $\sigma$, respectively, then (a) P
precedes Q if and only if $\rho < \sigma$, and moreover (b) if $\sigma = \rho + \lambda$, then the length³
of the segment PQ equals the ratio $\lambda$.


3.3 Problems with the Ratios


Even though the ratios are sufficient for all direct measurements of lengths in
practice and even though the ratios appear to provide a system of numbers adequate
for analytic geometry, problems arise in their theoretical use in geometry and
algebra, problems which indicate that the system of ratios is inadequate for its
purpose.


³ This implies the important additivity property for lengths of segments: If P, Q, R are points on a
line with Q between P and R then Len(PR) = Len(PQ) + Len(QR). However, this does not
imply that geometric line segments are a priori associated to lengths in an invariant fashion. The
fact that physical line segments have lengths (rigidity) is essentially empirical. See Carnap [7], An
Introduction to the Philosophy of Science, especially Chaps. 6-9, for more details.


Ratios Are Inadequate for Measuring Lengths 


A famous Pythagorean result states that the hypotenuse of a right-angled isosceles 
triangle is incommensurable relative to its legs: 


Problem 154. Use the Pythagorean theorem to show that if the length of each of
the legs of a right-angled isosceles triangle is measured by the ratio $\rho$, then there is
no ratio $\sigma$ measuring the length of the hypotenuse.


In particular if each of the legs has a length of 1, then there is no ratio which gives 
the exact length of the hypotenuse, even though ratios can approximate the length 
of the hypotenuse with arbitrarily small errors. 


Ratios Are Inadequate for Analytic Geometry 


A consequence of the above problem of measuring lengths is that if we use ratios as 
the system of coordinates for analytic geometry, it can sometimes fail to represent 
points of intersection. For example, it is a theorem of geometry that a ray originating 
at the center of any circle must intersect it at a unique common point. However, in
the picture shown below,


[Figure: the right triangle OAB together with the circle]


if both legs AB and OB of the right triangle OAB have length 1, then the point
of the ray OB intersected by the circle is not represented by any ratio.


Ratios Are Inadequate for Solving Algebraic Equations:
Failure of the Intermediate Value Theorem


We have earlier seen that the equation $x^2 = 2$ has no rational solution, even though
approximate solutions can be found in the ratios with arbitrarily small errors. More
general algebraic equations face similar problems, due to the lack of a standard tool
called the Intermediate Value Theorem (IVT), which guarantees existence of roots
in the case of real numbers.

To discuss the Intermediate Value Theorem, we need the notion of a continuous
function defined on an order. The reader may already be familiar with continuous
functions as encountered in elementary calculus. Roughly speaking, F is continu- 
ous, or equivalently the value F(x) depends continuously on x, if “small changes 
in x produce small changes in F(x).” This vague description will be made precise 
in the context of general orders, but for simplicity, we will restrict our attention to 
orders without endpoints. 

If a is in the domain of the function so that F(a) is the value of the function at the 
input a, then continuity of the function at a amounts to the following: If we desire 
that the values of the function should not differ from F(a) by more than a certain 
small amount, say by requiring that the values of $F(x)$ remain above a value $p$ and
below a value $q$ where $p < F(a) < q$, then we can always find an interval in the
domain containing the point $a$, say $(r, s)$ with $r < a < s$, such that throughout this
interval $(r, s)$ the values of the function remain within the prescribed limits, i.e., we
have $p < F(x) < q$ for $r < x < s$. This leads to the following precise definition:


Definition 155 (Continuous Function). Let $X$ be an order without first or last
elements. $F\colon X \to X$ is called continuous if for any $w \in X$ and any $\eta, \zeta \in X$
with $\eta < F(w) < \zeta$, there are $\alpha, \beta \in X$ with $\alpha < w < \beta$ such that for all $\xi \in X$,
$\alpha < \xi < \beta \Rightarrow \eta < F(\xi) < \zeta$.


Problem 156. Show that the squaring function $\rho \mapsto \rho^2$ defined on the set of ratios
is continuous.


[Hint: Use Problem 137 and Theorem 138.]


Definition 157 (Intermediate Value Theorem, or IVT). An order $X$ (without first
or last elements) is said to satisfy the IVT (Intermediate Value Theorem) if whenever
$F\colon X \to X$ is continuous, $\alpha < \beta$, and $\gamma$ lies strictly between $F(\alpha)$ and $F(\beta)$ (i.e.,
either $F(\alpha) < \gamma < F(\beta)$ or $F(\beta) < \gamma < F(\alpha)$), then $F(\xi) = \gamma$ for some $\xi$ with
$\alpha < \xi < \beta$.


Problem 158. Show that the set of ratios with their usual ordering does not satisfy 
the IVT. 


The IVT is the main tool for formalizing and establishing results which claim, 
roughly, that if two continuous curves “cross,” then they must have at least one 
common point of intersection. It is a workhorse for guaranteeing existence of roots 
in many algebraic equations.


3.4 Irrationals: Dedekind’s Definition of the Continuum


While the geometric axioms of order and order-density (and practical use of ratios in 
measurement) appear to provide an adequate correspondence between the points of a 
ray and the ratios, Dedekind realized that we assume a more fundamental underlying 
continuity property when we believe that “two continuous crossing curves must
intersect,” or even that a ray originating at the center of a circle must intersect it at a
unique common point. 


Dedekind Cuts 


Dedekind’s method of isolating this continuity property is to partition or “cut” a 
given ordering into two nonempty pieces with one piece completely preceding the 
other. Formally: 


Definition 159. A Dedekind cut in an ordering $X$ is a partition of $X$ consisting of two
nonempty disjoint sets $L$ and $U$ such that $x \in L,\ y \in U \Rightarrow x < y$, i.e., every
member of $L$ precedes every member of $U$, as pictured below.


[Figure: the ordering $X$ divided into a lower piece $L$ and an upper piece $U$]


In other words, all elements of $U$ are upper bounds for $L$, all elements of $L$ are
lower bounds for $U$, as well as $L \neq \emptyset \neq U$, $L \cap U = \emptyset$, and $L \cup U = X$.

If $L, U$ is a Dedekind cut, exactly one of the following four possibilities holds:


1. Both L has a largest element and U has a smallest element. In this case we call 
the cut a Dedekind jump, or simply a jump. 

2. L does not have a largest element but U has a smallest element. In this case the 
smallest element of U is viewed as a “limit” of the elements of L, and is called 
the (unique) boundary of the cut. 

3. L has a largest element but U does not have a smallest element. In this case the 
largest element of L is viewed as a “limit” of the elements of U and is called the 
(unique) boundary of the cut. 

4. L has no largest element and U has no smallest element. In this case we call
the cut a Dedekind gap, or simply a gap.


A cut as in case (2) or case (3) is called a cut with a unique limiting boundary, or 
simply a boundary cut. 
We now briefly discuss each type of cut. 
Jump


Case (1). This case, the possibility of a Dedekind jump, is ruled out in the presence 
of order-density. In fact, order-density is equivalent to being “jumpless,” i.e., an 
ordering is order-dense if and only if no Dedekind cut for it is a jump. The ratios, 
e.g., are order-dense and no Dedekind cut over them will be a jump. We will 
therefore not consider this case anymore, and assume that all orderings in this 
chapter will be order-dense and hence will have no jumps. 


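Boundary Cut


Cases (2) or (3). If $\sigma$ is any ratio, put


$$L_\sigma = \{\rho \mid \rho < \sigma\},\quad U_\sigma = \{\rho \mid \rho \ge \sigma\};\qquad L'_\sigma = \{\rho \mid \rho \le \sigma\},\quad U'_\sigma = \{\rho \mid \rho > \sigma\}.$$


Then $L_\sigma, U_\sigma$ form a Dedekind cut over the ratios, and so do $L'_\sigma, U'_\sigma$. The cuts
$L_\sigma, U_\sigma$ and $L'_\sigma, U'_\sigma$ are essentially equivalent, since both correspond to the ratio $\sigma$:
For both these cuts the ratio $\sigma$ is the boundary of the cut.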


Gap 


Case (4). For the ratios ordered by magnitude, put

$$L = \{\rho \mid \rho^2 < 2\}, \quad\text{and}\quad U = \{\rho \mid \rho^2 > 2\}.$$


This is a Dedekind cut over the ratios which is a gap. This follows from the results
of the previous chapter; recall the density of square ratios.
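
The claim that this cut is a gap has two halves: L has no largest ratio and U has no smallest ratio. The following Python sketch illustrates both halves with exact rational arithmetic. The map $\rho \mapsto (2\rho + 2)/(\rho + 2)$ is a standard device assumed here for illustration and is not taken from the text; the function name `step` is likewise hypothetical. The sketch does not show that no ratio has square 2, which the text takes from the previous chapter.

```python
from fractions import Fraction


def step(rho: Fraction) -> Fraction:
    """Map a positive ratio rho to (2*rho + 2) / (rho + 2).

    Since step(rho) - rho = (2 - rho^2) / (rho + 2) and
    step(rho)^2 - 2 = 2 * (rho^2 - 2) / (rho + 2)^2,
    the result stays on the same side of the cut as rho
    but lies strictly closer to the missing boundary.
    """
    return (2 * rho + 2) / (rho + 2)


# L has no largest element: a member of L is exceeded by another member of L.
rho = Fraction(7, 5)              # 49/25 < 2, so rho is in L
bigger = step(rho)
assert rho < bigger and bigger ** 2 < 2

# U has no smallest element: a member of U exceeds another member of U.
sigma = Fraction(3, 2)            # 9/4 > 2, so sigma is in U
smaller = step(sigma)
assert smaller < sigma and smaller ** 2 > 2
```

The two identities in the docstring hold for every positive ratio, so the same check succeeds from any starting point in L or in U.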

Another example of a gap is provided if we remove a single fixed point on an 
open ray: The remaining set of points on the ray breaks apart into two pieces L and 
U, forming a gap. 

Thus for an order-dense ordering, any Dedekind cut is either a boundary cut with 
a unique boundary (cases (2) or (3)), or is a gap (case (4)). 


Problem 160 (Density of Gaps). Prove that for the ratios ordered by magnitude,
if $\rho < \sigma$ then there is a gap between $\rho$ and $\sigma$, i.e., there is a Dedekind cut L, U for
the ratios such that (a) $\rho \in L$, (b) $\sigma \in U$, and (c) L, U is a gap (L has no maximum
and U has no minimum).


Dedekind’s Definition of a Linear Continuum 


As we saw earlier, Dedekind found that the root cause behind the inadequacy of the 
ratios lies in the fact that the ratios have lots of gaps. 
The hypotenuse of an isosceles right triangle is incommensurable relative to its
legs because there is no ratio whose square is two, causing the gap $L = \{\rho \mid \rho^2 < 2\}$
and $U = \{\rho \mid \rho^2 > 2\}$.

The point of intersection between two geometric curves which appear to cross 
may lack coordinate representation by ratios because the “location of the crossing” 
may correspond to a gap in the ratios. 

The IVT for ratios fails for the same reason. 

For geometry, we thus postulate that in addition to the axioms of order and order- 
density, the points of a ray or a line must satisfy: 


Axiom of Continuity. The ordered set of points of a ray has no gaps. In other 
words, if all the points of the ray are partitioned into two disjoint nonempty sets
L and U with all points of L preceding all the points of U, then the Dedekind 
cut L, U is a boundary cut, i.e., there is a point of the ray which is the boundary 
of the cut. 


Finally, we have the definition for a linear continuum. 


Definition 161 (Dedekind). An ordering is order-complete (or simply complete) if 
it has no Dedekind gaps. A linear continuum is an ordering with at least two points 
which is order-dense and order-complete.⁴


Thus an ordering is a linear continuum if and only if (a) it has at least two points, 
and (b) every Dedekind cut is a boundary cut. 

Anybody familiar with “limits and continuity of real functions” as studied in 
elementary calculus will recall examples of removable discontinuity, as in the 
function $x \mapsto (x^2 - 1)/(x - 1)$ at $x = 1$.

One of Dedekind’s simple but fundamental intuitions was this: For an order
to be a continuum, it must be “continuous,” and so must not have such
discontinuities. By not allowing gaps, Dedekind’s definition of continuum precisely 
avoids discontinuities of this type and achieves continuity. 


Problem 162. Prove that an ordering without first or last elements forms a linear 
continuum if and only if it satisfies the IVT. 


[Hint: Given $a < b$ in a linear continuum and $f(a) < c < f(b)$ with $f$ continuous,
let L be the set of all $x$ such that $x < y$ for some $y \in [a,b]$ with $f(y) < c$, and
let U be the complement of L. Then L, U is a Dedekind cut, $a \in L$, $b \in U$, and if
$z = \max L$ or $z = \min U$ then $f(z) = c$.]


Once we postulate the axiom of continuity for the geometric ray, we see that
the ratios are unable to label all the points of the ray, and infinitely many points on
the ray (the “irrational points” on it) do not get labeled by any ratio at all. (This is
because the ratios, ordered by magnitude, have an abundance of Dedekind gaps.) To
see this, suppose that $\sigma \leftrightarrow P_\sigma$ is a correspondence between the ratios and certain
points of the ray preserving order, so that $\sigma < \rho$ if and only if $P_\sigma$ precedes $P_\rho$.
We then say that the point $P_\sigma$ is labeled by or represented by the ratio $\sigma$. Now if
$P_{\sigma_0}$ precedes $P_{\rho_0}$, then $\sigma_0 < \rho_0$, so there is a Dedekind gap L, U of the ratios with
$\sigma_0 \in L$ and $\rho_0 \in U$. Let A be the set of points on the ray which precede $P_\sigma$ for
some $\sigma \in L$, and let B be the set of points of the ray which are preceded by $P_\rho$ for
some $\rho \in U$. Since L, U form a Dedekind gap of the ratios and since $\sigma \leftrightarrow P_\sigma$ is an
order-preserving correspondence, A has no last point and B has no first point. Hence
by the Axiom of Continuity there must be a point R on the ray which is neither in
A nor in B, and so R is not represented by any ratio.


⁴ Order completeness will be studied in a more general setting in Sect. 8.2, where we will see that
completeness is equivalent to the least upper bound property (see Problem 518).

⁵ Following Dedekind, all rigorous axiomatizations of geometry, first by Hilbert and later by Tarski
and Birkhoff, postulate this property as the Axiom of Continuity.

We can thus divide the points on the ray into two disjoint sets: (a) Each point 
on the ray that does not correspond to any ratio is called an irrational point, while 
(b) the points that correspond to ratios are called the rational points. Furthermore, 
the rational and irrational points on the ray are intermixed in a dense fashion: 
Between any two points on the ray, there are an infinite number of rational points 
and also an infinite number of irrational points. 

If a system of numbers has to serve as an adequate system of coordinates for 
analytic geometry, then they will need to be in one-to-one correspondence with 
all the points of the ray in an order-preserving way, and so they must satisfy the 
axiom of continuity as well. While each rational point on the ray is represented by a 
ratio, for each irrational point on the ray we are missing the “irrational number”
to represent it, since the ratios, as a number system, are full of gaps. We thus look
for “irrational numbers” to fill all these gaps—the removable discontinuities—to 
extend the system of ratios to a number system satisfying the axiom of continuity 
and adequate for analytic geometry. 

It is now crucial to notice that the points of the ray correspond to Dedekind cuts 
over the ratios: The rational points on the ray correspond to Dedekind cuts with 
boundary, while each irrational point on the ray corresponds to a Dedekind gap over 
the ratios. This natural one-to-one correspondence between the irrational points on 
the ray and the gaps over the ratios led Dedekind to define an irrational number 
simply as a Dedekind cut over the ratios which is a gap. 

Our construction will be a slight variant of Dedekind’s original one. First, notice 
that a Dedekind cut L, U over the ratios is determined by the lower set L alone,
since U can be found from L by taking its complement. Thus instead of a pair 
L,U, we will only use L. Second, the two “rational” cuts with the same boundary 
o mentioned above are essentially equivalent; we will use the cut where the lower set 
L has no maximum. Thus our “numbers” will be just the lower parts L of Dedekind 
cuts L,U where L has no maximum (U may or may not have a minimum). 


3.5 Lengths (Magnitudes) 


We now define “length” or “magnitude” in a way so that the length of a line segment 
will consist of all ratios representing lengths shorter than the given line segment. 
Definition 163 (Length). We say that $\Gamma$ is a length if and only if


1. $\Gamma$ is a set of ratios.

2. $\Gamma$ contains at least one ratio but does not contain all ratios.

3. $\rho \in \Gamma$ and $\sigma < \rho \Rightarrow \sigma \in \Gamma$. (So far, the conditions say that $\Gamma$ forms the lower
piece of a Dedekind cut over the ratios.)

4. $\Gamma$ does not contain a largest ratio, i.e., $\rho \in \Gamma \Rightarrow \exists \sigma \in \Gamma\ (\sigma > \rho)$.


Lengths will in general be denoted by uppercase non-Roman Greek letters such as
$\Gamma, \Delta, \Phi, \Psi, \Lambda, \Xi, \Theta, \Sigma$, and $\Omega$.


Definition 164. For any length $\Gamma$, we define ${\sim}\Gamma := \{\rho \mid \rho \notin \Gamma\}$.

Problem 165. The pair $\Gamma, {\sim}\Gamma$ forms a Dedekind cut over the ratios.


Definition 166. Given a ratio $\rho$, we define $\rho^*$ as


$$\rho^* := \{\sigma \mid \sigma < \rho\}.$$


Problem 167. For any $\rho$, $\rho^*$ is a length.


Definition 168. A length $\Gamma$ is rational if and only if $\Gamma = \rho^*$ for some $\rho$.
Otherwise, $\Gamma$ is irrational.


Problem 169. A length $\Gamma$ is rational if and only if $\Gamma, {\sim}\Gamma$ is a boundary cut; and
$\Gamma$ is irrational if and only if $\Gamma, {\sim}\Gamma$ is a gap.


Problem 170. There are rational and irrational lengths. 
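
One example of each kind can be spot-checked in Python. The sketch below models a length by its membership test on positive ratios, which is a modelling assumption made only for illustration (the text defines a length as the set itself). The names `Length`, `star`, and `sqrt2_length` are hypothetical. Finitely many membership tests prove nothing about the four conditions of Definition 163; they only show how the sets are described.

```python
from fractions import Fraction
from typing import Callable

# Assumption of this sketch: a length is represented by the predicate
# "is this positive ratio a member of the set?".
Length = Callable[[Fraction], bool]


def star(rho: Fraction) -> Length:
    """The rational length rho* = {sigma | sigma < rho} (Definition 166)."""
    return lambda sigma: 0 < sigma < rho


def sqrt2_length(sigma: Fraction) -> bool:
    """The lower piece {sigma | sigma^2 < 2} of the gap described earlier."""
    return sigma > 0 and sigma * sigma < 2


three_halves = star(Fraction(3, 2))
assert three_halves(Fraction(7, 5))          # 7/5 < 3/2
assert not three_halves(Fraction(3, 2))      # the boundary itself is excluded
assert sqrt2_length(Fraction(7, 5))          # 49/25 < 2
assert not sqrt2_length(Fraction(3, 2))      # 9/4 > 2
```

Excluding the boundary from `star(rho)` mirrors condition 4 of Definition 163: a length never contains a largest ratio.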


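Definition 171. For lengths $\Gamma, \Delta$, we write

$$\Gamma < \Delta \quad\text{if and only if}\quad \Gamma \subseteq \Delta \text{ and } \Gamma \neq \Delta.$$


Problem 172. The relation < is an ordering of the set of all lengths.


Problem 173 (Density of Rational and Irrational Lengths). If $\Gamma < \Delta$, then
(a) there exists a rational $\Phi$ such that $\Gamma < \Phi < \Delta$, and (b) there exists an irrational
$\Xi$ such that $\Gamma < \Xi < \Delta$.

Definition 174 (Addition). $\Gamma + \Delta := \{\rho + \sigma \mid \rho \in \Gamma,\ \sigma \in \Delta\}$.

Theorem 175. $\Gamma + \Delta$ is a length.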


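Proof. First, we show that neither $\Gamma + \Delta$ nor its complement is empty: Fix $\gamma \in \Gamma$,
$\gamma' \in {\sim}\Gamma$, $\delta \in \Delta$, and $\delta' \in {\sim}\Delta$. Then $\gamma + \delta \in \Gamma + \Delta$, but $\gamma' + \delta' \notin \Gamma + \Delta$,
since $\rho \in \Gamma$ and $\sigma \in \Delta$ implies $\rho < \gamma'$ and $\sigma < \delta'$ and so $\rho + \sigma < \gamma' + \delta'$, so
$\rho + \sigma \neq \gamma' + \delta'$ for all $\rho \in \Gamma$ and $\sigma \in \Delta$, so $\gamma' + \delta' \notin \Gamma + \Delta$.

Next, $\Gamma + \Delta$ is “closed under taking lower members”: Let $\zeta \in \Gamma + \Delta$, so that
$\zeta = \gamma + \delta$ with $\gamma \in \Gamma$ and $\delta \in \Delta$. Given $\alpha < \zeta$, put $\alpha/\zeta = \tau$. Then $\tau < 1$ and so
$\gamma\tau \in \Gamma$ and $\delta\tau \in \Delta$, so $\gamma\tau + \delta\tau \in \Gamma + \Delta$, but $\gamma\tau + \delta\tau = \tau(\gamma + \delta) = \tau\zeta = \alpha$, so
$\alpha \in \Gamma + \Delta$.

Finally, $\Gamma + \Delta$ has no maximum ratio: Let $\zeta \in \Gamma + \Delta$, so $\zeta = \gamma + \delta$ for some
$\gamma \in \Gamma$ and $\delta \in \Delta$. There is $\gamma' \in \Gamma$ such that $\gamma' > \gamma$. Then $\gamma' + \delta \in \Gamma + \Delta$ and
$\gamma' + \delta > \gamma + \delta = \zeta$. □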
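
The scaling step in the middle of the proof can be traced on concrete ratios. The Python sketch below is only an illustration under the assumption that ratios are modelled by positive `Fraction` values; the function name `split_lower` is hypothetical.

```python
from fractions import Fraction


def split_lower(alpha: Fraction, gamma: Fraction, delta: Fraction):
    """Split alpha < gamma + delta as (gamma * tau) + (delta * tau).

    Here tau = alpha / (gamma + delta) < 1, so each piece is smaller than
    the corresponding summand, exactly as in the proof of Theorem 175.
    """
    zeta = gamma + delta
    if not 0 < alpha < zeta:
        raise ValueError("need 0 < alpha < gamma + delta")
    tau = alpha / zeta
    return gamma * tau, delta * tau


gamma, delta = Fraction(2, 3), Fraction(3, 4)     # gamma + delta = 17/12
g, d = split_lower(Fraction(1), gamma, delta)
assert g + d == 1                # the two pieces add up to alpha
assert g < gamma and d < delta   # so g lies in Gamma and d lies in Delta
```

Because a length is closed under taking lower members, `g < gamma` with `gamma` in $\Gamma$ places `g` in $\Gamma$, and similarly for `d`.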


Problem 176. Addition of lengths is associative and commutative. 
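Theorem 177. For any $\rho$ and $\Gamma$, there is $\sigma$ such that $\sigma \in \Gamma$ but $\sigma + \rho \notin \Gamma$.


Proof. Given $\rho$ and $\Gamma$, as $\Gamma, {\sim}\Gamma$ form a Dedekind partition, we can use the
Fineness property (previous chapter) to find $\sigma \in \Gamma$ and $\tau \in {\sim}\Gamma$ such that $\tau < \sigma + \rho$.
Hence $\sigma + \rho \in {\sim}\Gamma$, as ${\sim}\Gamma$ is “upward closed under >.” □


Theorem 178. $\Gamma < \Gamma + \Delta$.


Proof. Let $\gamma \in \Gamma$. Fix $\delta \in \Delta$. Fix $\tau < \min(\gamma, \delta)$. Then $\gamma = \gamma' + \tau$ for some $\gamma' < \gamma$
and $\tau < \delta$. Then $\gamma' \in \Gamma$ and $\tau \in \Delta$, so $\gamma \in \Gamma + \Delta$. So $\Gamma \subseteq \Gamma + \Delta$. Next fix $\rho \in \Delta$,
and by Theorem 177 find $\sigma \in \Gamma$ such that $\sigma + \rho \in {\sim}\Gamma$. Then $\sigma + \rho \in \Gamma + \Delta$ but
$\sigma + \rho \notin \Gamma$. □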


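Problem 179. If $\Gamma < \Delta$, or $\Gamma = \Delta$, or $\Gamma > \Delta$, then

$$\Gamma + \Xi < \Delta + \Xi, \quad\text{or}\quad \Gamma + \Xi = \Delta + \Xi, \quad\text{or}\quad \Gamma + \Xi > \Delta + \Xi,$$

respectively, and conversely.

Theorem 180. If $\Gamma < \Delta$, then $\Gamma + \Xi = \Delta$ for a unique $\Xi$.


Proof. Let $\Xi := \{\alpha \mid (\exists \beta \in {\sim}\Gamma)(\beta + \alpha \in \Delta)\}$. We show that $\Gamma + \Xi = \Delta$. It is easy
to see that $\Gamma + \Xi \subseteq \Delta$. Now let $\zeta \in \Delta$. If $\zeta \in \Gamma$, then $\zeta \in \Gamma + \Xi$ by Theorem 178.
So assume $\zeta \in \Delta \setminus \Gamma$. Pick $\tau \in \Delta$ such that $\zeta < \tau$. Let $\tau = \zeta + \rho$. By Theorem 177,
find $\sigma \in \Gamma$ such that $\sigma + \rho \in {\sim}\Gamma$. Since $\sigma < \zeta$, we have $\zeta = \sigma + \alpha$ for some $\alpha$. Now put
$\beta = \sigma + \rho$. Then $\beta \in {\sim}\Gamma$ and $\beta + \alpha = \sigma + \rho + \alpha = \sigma + \alpha + \rho = \zeta + \rho = \tau \in \Delta$,
so $\alpha \in \Xi$. So $\zeta = \sigma + \alpha \in \Gamma + \Xi$. □


Definition 181 (Proper Subtraction). If $\Gamma < \Delta$, we define $\Delta - \Gamma$ to be the
unique $\Xi$ such that $\Delta = \Gamma + \Xi$.


Definition 182 (Multiplication). $\Gamma\Delta := \{\rho\sigma \mid \rho \in \Gamma,\ \sigma \in \Delta\}$.

Definition 183 (Reciprocal). $\Gamma^{-1} := \{\rho \mid (\exists \sigma \in {\sim}\Gamma)(\rho\sigma < 1)\}$.

Problem 184. $\Gamma\Delta$ is a length and $\Gamma^{-1}$ is a length.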


Problem 185. Multiplication (of lengths) as defined above is associative and
commutative, and for any length $\Gamma$ we have $\Gamma\, 1^* = \Gamma$ and $\Gamma\,\Gamma^{-1} = 1^*$. (The
lengths form a multiplicative “commutative group” with unity $1^*$.)

Also, multiplication is distributive over addition.


Problem 186. If $\Gamma < \Delta$, or $\Gamma = \Delta$, or $\Gamma > \Delta$, then

$$\Gamma\Xi < \Delta\Xi, \quad\text{or}\quad \Gamma\Xi = \Delta\Xi, \quad\text{or}\quad \Gamma\Xi > \Delta\Xi,$$

respectively, and conversely.

Definition 187. $\sqrt{2} := \{\rho \mid \rho^2 < 2\}$.

Problem 188. $\sqrt{2}$ is an irrational length, and $\sqrt{2}\,\sqrt{2} = 2^*$.


Problem 189 (Existence of Square Roots). Given any length $\Gamma$ there is a unique
length $\Delta$ such that $\Delta\Delta = \Gamma$.


Problem 190 (Isomorphic Embedding of Ratios). The mapping $\rho \mapsto \rho^*$ is a
bijection from the set of ratios onto the set of rational lengths which preserves order,
addition, and multiplication, i.e.,


$$\rho < \sigma \Leftrightarrow \rho^* < \sigma^*; \qquad (\rho + \sigma)^* = \rho^* + \sigma^*; \qquad (\rho\sigma)^* = \rho^*\sigma^*.$$


At this point, the ratios and the rational lengths become interchangeable since all 
the properties of the ratios listed in earlier sections are possessed by the rational 
lengths. 

Therefore, we throw away the ratios⁶ and use the corresponding rational lengths
in their place. So only one type of numbers remains, namely the lengths, which
include the “ratios” (really the rational lengths), and therefore in turn also the 
“natural numbers” (integral lengths), as subsets. 


Definition 191 (New Meaning for Symbols for Ratios). With the old ratios 
thrown out, the rational length $\rho^*$ will now be denoted simply by the letter $\rho$ (and
similarly for other Greek letters). Not only do Greek letters now exclusively stand 
for rational lengths, but also other symbols that were previously used to denote 
a ratio will now denote the corresponding rational length (e.g., 2 now stands for 
what was being called 2*). Similarly, lowercase Roman letters will denote integral 
lengths. 


This allows us to mix symbols that were previously assigned to different types, and 
“$n + \rho + \Gamma$” and “$n \cdot \rho \cdot \Gamma$” now become valid terms.


Problem 192 (Dedekind’s Theorem for the Real Continuum). The collection of 
lengths ordered by the relation < forms a linear continuum containing the ratios as 
a subset. Thus, it is an ordering which is order-dense and order-complete (has no 
Dedekind gaps), and so every Dedekind cut for the lengths is a boundary cut. 


[Hint: In a Dedekind cut of the ordered collection of all lengths into two pieces, the 
set-theoretic union of the lengths in the left piece is itself a length.] 


We now have a system of numbers (the lengths) which can uniquely represent
every point of the geometric open ray and serve as the basis for analytic geometry.
The operations of addition, multiplication, and division are possible between an
arbitrary pair of these numbers. However, we are still missing “negative magnitudes,”
and so “the subtraction $\Gamma - \Delta$” is defined only when $\Gamma > \Delta$. In the next
section, we extend the system of lengths to a system of “signed lengths,” or the field
of real numbers, in which the subtraction of two arbitrary real numbers produces a
well-defined real number.


⁶ Phrase of Edmund Landau [47].


3.6 The Ordered Field R of Real Numbers 


To get signed real numbers, we regard a pair of lengths $(\Gamma, \Delta)$ as the “signed
magnitude” $\Gamma - \Delta$. This means that the length-pairs $(\Gamma, \Delta)$ and $(\Xi, \Theta)$ will define
the same signed real if $\Gamma + \Theta = \Xi + \Delta$, which of course results in a lot of
duplication. More precisely, this condition defines an equivalence relation on the set
of pairs of lengths and we could use the approach of forming the “quotient structure”
by defining signed real numbers as equivalence classes. It is easy, however, to choose
canonical representatives by considering those pairs in which the smaller member
equals 1. Then 1 acts as a reference length and the magnitude of the signed real is
determined by how much the other length of the pair exceeds 1. Thus positive reals
are precisely the pairs of the form $(\Gamma, 1)$ with $\Gamma > 1$, and negative reals are the pairs
$(1, \Gamma)$ with $\Gamma > 1$. Zero is defined as the pair $(1, 1)$.


Definition 193 (Real Numbers). A real number is a pair of lengths $(\Gamma, \Delta)$ such
that $\min(\Gamma, \Delta) = 1$. The set of all real numbers is denoted by R.


Thus a real number is an ordered pair of lengths none of which is less than 1 and at
least one of which equals 1.


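Definition 194 (Zero, Negative, and Positive Reals). A real number $(\Gamma, \Delta)$ is
called positive if $\Gamma > 1$ (and so $\Delta = 1$), and $(\Gamma, \Delta)$ is negative if $\Delta > 1$ (and
so $\Gamma = 1$). Define $0 := (1, 1)$.

The set of positive real numbers will be denoted by $\mathbf{R}^+$, and the set of negative
real numbers will be denoted by $\mathbf{R}^-$.


Thus $(\Gamma, \Delta)$ is positive if $\Gamma > \Delta$, is negative if $\Gamma < \Delta$, and is zero if $\Gamma = \Delta$.

Definition 195. For any pair of lengths $\Gamma, \Delta$, define:

1. $(\Gamma, \Delta) \sim (\Xi, \Theta)$, or $(\Gamma, \Delta)$ is equivalent to $(\Xi, \Theta)$, if $\Gamma + \Theta = \Xi + \Delta$.

2. $${}^*(\Gamma, \Delta) := \begin{cases} (1 + \Gamma - \Delta,\ 1) & \text{if } \Gamma > \Delta, \\ (1,\ 1 + \Delta - \Gamma) & \text{if } \Gamma < \Delta, \\ (1, 1) = 0 & \text{if } \Gamma = \Delta. \end{cases}$$

Note that ${}^*(\Gamma, \Delta)$ is always a real number. We now have:

Problem 196. For all lengths $\Gamma, \Delta$,


1. ${}^*(\Gamma, \Delta)$ is the unique real number satisfying ${}^*(\Gamma, \Delta) \sim (\Gamma, \Delta)$.

2. $(\Gamma, \Delta)$ is a real number if and only if ${}^*(\Gamma, \Delta) = (\Gamma, \Delta)$.

3. ${}^*(\Gamma, \Delta) = {}^*(\Gamma + \Xi, \Delta + \Xi)$ for any length $\Xi$.

4. ${}^*(\Gamma, \Gamma) = (1, 1) = 0$.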
Problem 197. $(\Gamma, \Delta) \sim (\Xi, \Theta)$ if and only if ${}^*(\Gamma, \Delta) = {}^*(\Xi, \Theta)$, and so
equivalence between pairs of lengths is an equivalence relation for which the
mapping $(\Gamma, \Delta) \mapsto {}^*(\Gamma, \Delta)$ is a complete invariant.


Definition 198 (Order, Addition, Multiplication). Given real numbers $(\Gamma, \Delta)$
and $(\Xi, \Theta)$, define


1. Order: $(\Gamma, \Delta) < (\Xi, \Theta) \Leftrightarrow \Gamma + \Theta < \Xi + \Delta$.
2. Sum: $(\Gamma, \Delta) + (\Xi, \Theta) := {}^*(\Gamma + \Xi, \Delta + \Theta)$.
3. Product: $(\Gamma, \Delta) \cdot (\Xi, \Theta) := {}^*(\Gamma\Xi + \Delta\Theta,\ \Gamma\Theta + \Delta\Xi)$.
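
The three definitions can be tried out mechanically. The Python sketch below uses positive `Fraction` values as stand-ins for lengths, so it covers only rational lengths; this restriction and the names `star`, `less`, `add`, and `mul` are assumptions of the sketch, not notation from the text.

```python
from fractions import Fraction

ONE = Fraction(1)


def star(gamma: Fraction, delta: Fraction):
    """Canonical real number equivalent to the pair (gamma, delta)."""
    if gamma > delta:
        return (ONE + gamma - delta, ONE)
    if gamma < delta:
        return (ONE, ONE + delta - gamma)
    return (ONE, ONE)                      # zero


def less(a, b) -> bool:
    """Order: (g, d) < (x, t) iff g + t < x + d."""
    (g, d), (x, t) = a, b
    return g + t < x + d


def add(a, b):
    """Sum: star(g + x, d + t)."""
    (g, d), (x, t) = a, b
    return star(g + x, d + t)


def mul(a, b):
    """Product: star(g*x + d*t, g*t + d*x)."""
    (g, d), (x, t) = a, b
    return star(g * x + d * t, g * t + d * x)


two = (Fraction(3), ONE)          # the pair (3, 1) stands for +2
minus_one = (ONE, Fraction(2))    # the pair (1, 2) stands for -1

assert add(two, minus_one) == (Fraction(2), ONE)      # 2 + (-1) = +1
assert mul(two, minus_one) == (ONE, Fraction(3))      # 2 * (-1) = -2
assert less(minus_one, two)
```

Reading a pair $(\Gamma, \Delta)$ as the signed magnitude $\Gamma - \Delta$ makes each result easy to check by hand.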


Uppercase Roman letters A, B,C, X, Y, Z, etc. will denote real numbers. 


Problem 199. The relation < defined above is an order on R. Also, $A \in \mathbf{R}$ is
positive if and only if $0 < A$, and $A$ is negative if and only if $A < 0$.


Problem 200. If $A, B \in \mathbf{R}$ are positive, then so are $A + B$ and $A \cdot B$.


Problem 201 (Additive Inverse). For each real number $A = (\Gamma, \Delta)$, define
$-A := (\Delta, \Gamma)$. Show that for any real number $A$,


1. $-A$ is a real number.

2. $A + (-A) = 0$, and $-(-A) = A$.

3. $A$ is positive if and only if $-A$ is negative.

4. $A = -A$ if and only if $A = 0$.


Problem 202 (Isomorphic Embedding of Lengths). For each length $\Gamma$, let $\widehat{\Gamma} :=
(\Gamma + 1, 1)$. The mapping $\Gamma \mapsto \widehat{\Gamma}$ is a bijection from the set of lengths onto the
set $\mathbf{R}^+$ of positive real numbers which preserves order, addition, and multiplication,
i.e.,


$$\Gamma < \Delta \Leftrightarrow \widehat{\Gamma} < \widehat{\Delta}; \qquad \widehat{\Gamma + \Delta} = \widehat{\Gamma} + \widehat{\Delta}; \qquad \widehat{\Gamma \cdot \Delta} = \widehat{\Gamma} \cdot \widehat{\Delta}.$$


At this point, the lengths and the positive reals $\mathbf{R}^+$ become interchangeable since all
the properties of the lengths listed earlier are possessed by the positive real numbers.

Therefore, we throw away the lengths⁷ and use the corresponding positive real
numbers in their place. In other words, we identify the lengths with the positive reals
$\mathbf{R}^+$, and a length now means a positive real, i.e., a member of $\mathbf{R}^+$. So from now on
we deal with only one type of numbers, namely the real numbers, which include
the “lengths” (really the positive reals) as a subset, as well as all previously defined
types such as the ratios and the natural numbers.


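Definition 203. As subsets of R, the natural numbers will be denoted by N, and the
ratios (positive rationals) by $\mathbf{Q}^+$. Thus we have:


$$\mathbf{N} \subset \mathbf{Q}^+ \subset \mathbf{R}^+ \subset \mathbf{R}.$$


Also put $\mathbf{Z} := \mathbf{N} \cup \{0\} \cup \{-A \mid A \in \mathbf{N}\}$ and $\mathbf{Q} := \mathbf{Q}^+ \cup \{0\} \cup \{-A \mid A \in \mathbf{Q}^+\}$, so:

$$\mathbf{N} \subset \mathbf{Z} \subset \mathbf{Q} \subset \mathbf{R}, \qquad \mathbf{N} = \mathbf{Z} \cap \mathbf{R}^+, \qquad \mathbf{Q}^+ = \mathbf{Q} \cap \mathbf{R}^+.$$


⁷ Phrase of Edmund Landau [47].


Problem 204. Q is order-dense in R, i.e., if $A < B$ are in R, then there is a $C \in \mathbf{Q}$
with $A < C < B$.


[Hint: Use Problem 173.]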

Since positive reals are identified with the lengths and since reciprocals have 
already been defined for lengths, reciprocals of arbitrary nonzero reals are defined 
in the following way. 


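Problem 205 (Multiplicative Inverse). For a positive real $A = (\Gamma, 1)$ with $\Gamma >
1$, define $A^{-1} := (\Gamma - 1)^{-1}$, the reciprocal of the length $\Gamma - 1$ with which $A$ is
identified. For $A < 0$, define $A^{-1} := -((-A)^{-1})$. If $A = 0$, we
leave $A^{-1}$ undefined. Then, for any real number $A \neq 0$,


1. $A^{-1}$ is a nonzero real number, and $A > 0 \Leftrightarrow A^{-1} > 0$.
2. $A \cdot A^{-1} = 1$ and $(A^{-1})^{-1} = A$.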


We now have all the operations to develop the theory of real numbers, but the 
algebraic theory of real numbers can be derived from the properties listed in the 
following definition. 


Definition 206 (Ordered Fields). A set with an order < containing two distinct
elements 0 and 1 and with two operations addition (+) and multiplication (·) is an
ordered field if, to each $A$ there corresponds an element $-A$, and for $A \neq 0$ an
element $A^{-1}$, such that for all elements $A, B, C$ we have:


1. $A + B = B + A$ and $AB = BA$.

2. $A + (B + C) = (A + B) + C$ and $A(BC) = (AB)C$.

3. $A(B + C) = AB + AC$.

4. $A + 0 = A = A \cdot 1$.

5. $A + (-A) = 0$ and if $A \neq 0$ then $AA^{-1} = 1$.

6. $A > 0$ if and only if $-A < 0$, and $A, B > 0 \Rightarrow A + B > 0$ and $AB > 0$.


The ordered field is called complete if the ordering forms a linear continuum. 
The following theorem is the main and central result of this chapter. 


Theorem 207 (R is a Complete Ordered Field). The set R, with the ordering < 
and the operations + and -, forms an ordered field in which the ordering relation 
< and the operations + and - extend the corresponding relation and operations 
originally defined for $\mathbf{R}^+$, $\mathbf{Q}^+$, and $\mathbf{N}$.

Moreover, R is a linear continuum (no Dedekind cut is a gap), and hence R 
satisfies the Intermediate Value Theorem (IVT). 


Problem 208. Prove Theorem 207. 


[Hints: The algebraic properties listed in the definition of ordered field are all proved
by expressing real numbers in terms of lengths and exploiting the corresponding
property for lengths, making frequent use of the construct ${}^*(\Gamma, \Delta)$. For example,
the commutative property of addition is proved as:


$$(\Gamma, \Delta) + (\Xi, \Theta) = {}^*(\Gamma + \Xi, \Delta + \Theta) = {}^*(\Xi + \Gamma, \Theta + \Delta) = (\Xi, \Theta) + (\Gamma, \Delta),$$

and the property $A + 0 = A$ is proved, by taking $A = (\Gamma, \Delta) = {}^*(\Gamma, \Delta)$, as:

$$(\Gamma, \Delta) + 0 = (\Gamma, \Delta) + (1, 1) = {}^*(\Gamma + 1, \Delta + 1) = {}^*(\Gamma, \Delta) = (\Gamma, \Delta).$$


To show that R is a linear continuum, use the corresponding fact for $\mathbf{R}^+$.]


Problem 209. R satisfies the Archimedean property, that is, for all $x, y \in \mathbf{R}$ if
$x > 0$ then there is a positive integer $n$ such that $nx > y$.
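
For rational $x$ and $y$ a witness $n$ can be computed directly. The Python sketch below is restricted to `Fraction` inputs, which is an assumption made so that the arithmetic is exact; it says nothing about arbitrary reals, and the function name `archimedean_witness` is hypothetical.

```python
from fractions import Fraction
from math import floor


def archimedean_witness(x: Fraction, y: Fraction) -> int:
    """Return a positive integer n with n * x > y, assuming x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    # floor(y / x) + 1 is the least integer strictly greater than y / x;
    # max(1, ...) keeps n positive when y is zero or negative.
    return max(1, floor(y / x) + 1)


n = archimedean_witness(Fraction(1, 1000), Fraction(7))
assert n * Fraction(1, 1000) > 7
```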


A bounded closed interval is a set of the form $I = [a, b] = \{x \in \mathbf{R} \mid a \le x \le b\}$
($a < b$), whose length is defined as $\mathrm{len}(I) := b - a$. A sequence $\langle I_n \mid n \in \mathbf{N} \rangle$ of
intervals is said to be a nested sequence if $I_n \supseteq I_{n+1}$ for all $n \in \mathbf{N}$.


Theorem 210 (The Nested Interval Property). For a nested sequence


$$I_1 \supseteq I_2 \supseteq \cdots \supseteq I_n \supseteq \cdots$$


of nonempty bounded closed intervals in R, we have $\bigcap_{n \in \mathbf{N}} I_n \neq \emptyset$.


Proof. If $I_n = [a_n, b_n]$ is a nested sequence of nonempty closed intervals, then
$a_n \le a_{n+1} \le b_{n+1} \le b_n$ for all $n$. Let $L := \{x \mid x < a_n \text{ for some } n\}$ and $U :=
\{y \mid y > b_n \text{ for some } n\}$. Note that we have $x < y$ for all $x \in L$ and $y \in U$. Also
L has no maximum and U has no minimum. Now if $\bigcap_n I_n$ were empty, then we
would have $L \cup U = \mathbf{R}$, and so L, U would be a Dedekind gap in R, which is a
contradiction. Hence $\bigcap_n I_n$ must be nonempty. □


This theorem is true in any linear continuum, not just R (Theorems 579 and 578).
However, the intersection $\bigcap_n I_n$ here may contain multiple points. The following
version ensures that the intersection contains a unique point under the additional
restriction that the lengths of the intervals approach zero in the limit, i.e., for any
$\varepsilon > 0$ there is $n$ with $\mathrm{len}(I_n) < \varepsilon$.


Theorem 211 (The Cauchy Nested Interval Property). If for a nested sequence
$\langle I_n \mid n \in \mathbf{N} \rangle$ of nonempty closed intervals in R we have $\mathrm{len}(I_n) \to 0$ as $n \to \infty$,
then the intersection $\bigcap_n I_n$ contains a unique point.


Proof. By Theorem 210, $\bigcap_n I_n \neq \emptyset$. Having $p, q \in \bigcap_n I_n$ with $p < q$ would imply
$\mathrm{len}(I_n) \ge q - p$ for all $n$, contrary to the assumption $\mathrm{len}(I_n) \to 0$. □
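
A concrete nested sequence with lengths tending to zero is produced by repeated bisection. The Python sketch below builds such a sequence with rational endpoints around the point whose square is 2; the choice of target, the starting interval $[1, 2]$, and the function name `nested_intervals` are assumptions made for illustration.

```python
from fractions import Fraction


def nested_intervals(steps: int):
    """Bisect [1, 2] repeatedly, keeping the half with a^2 < 2 < b^2.

    Returns the list of intervals (a_n, b_n); each is contained in the
    previous one and has half its length.
    """
    a, b = Fraction(1), Fraction(2)
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid < 2:        # a rational midpoint never has square exactly 2
            a = mid
        else:
            b = mid
        intervals.append((a, b))
    return intervals


ivs = nested_intervals(10)
a, b = ivs[-1]
assert b - a == Fraction(1, 2 ** 10)     # len(I_n) is halved at every step
assert a * a < 2 < b * b                 # the common point is still bracketed
```

Viewed inside R, Theorem 211 gives exactly one common point. Viewed inside the rationals alone, the same intervals have empty intersection, which is another way of seeing the gap at $\sqrt{2}$.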


Theorem 207 and the above results are the foundations for developing “real variable 
theories” such as calculus. But such a development is beyond the scope of this text 
and belongs to the subject of mathematical analysis. 
3.7 Additional Facts on Ordered Fields*


We state without proof some important results about R and ordered fields. 
Examples of basic algebraic properties that can be derived from the ordered-field 
axioms of Definition 206 are: 


1. $A + B = A + C \Rightarrow B = C$, $A \cdot 0 = 0$, and $(-A)(-B) = AB$.

2. $0 < 1$. Also, $A \neq 0 \Rightarrow A^2 > 0$, and so there is no $A$ such that $A^2 + 1 = 0$ (i.e.,
the polynomial $x^2 + 1$ has no zero in an ordered field).

3. $A < B \Rightarrow A < (A + B) \cdot 2^{-1} < B$, so every ordered field is order-dense.

4. Every ordered field contains a subfield isomorphic to Q. 


Deriving such algebraic results from the ordered-field axioms is done in elementary 
abstract algebra. The reader may wish to try to derive these results as exercises, or 
find them in standard abstract algebra texts. 

Unlike the purely algebraic properties which do not depend on order- 
completeness, the following properties need the fact that R is a linear continuum. 
Proofs use the IVT and can be found in standard real analysis texts. 


1. Every positive real number has an n-th root ($n \in \mathbf{N}$).
2. Any odd degree polynomial over R has a root in R. 


Theorem 212. An ordered field is order complete (a linear continuum) if and only if 
it satisfies both the Archimedean Property and the Cauchy Nested Interval Property. 


The qualifier “Cauchy” in the theorem may be dropped, since in Archimedean fields
the NIP (Nested Interval Property) is equivalent to the weaker property of having
the Cauchy NIP.⁸

Also, the conditions of being Archimedean and satisfying the NIP are independent:
There are ordered fields (such as Q) which are Archimedean but satisfy
neither of the NIPs, and there are non-Archimedean fields which satisfy both the
NIPs.⁹


Theorem 213 (Categoricity of R). If F is any order-complete ordered field, then
F must be “isomorphic to” R, i.e., there is a (unique) one-to-one correspondence
$x \leftrightarrow x'$ between the elements $x \in F$ and the elements $x' \in \mathbf{R}$ such that for all
$x, y \in F$:


$$x < y \Leftrightarrow x' < y', \qquad (x + y)' = x' + y', \qquad\text{and}\qquad (xy)' = x'y'.$$


This result says that R is essentially (“up to isomorphism”) the unique order-complete
ordered field: Two such fields will have identical structural properties and
so will be structurally indistinguishable.


⁸ However, there are non-Archimedean fields satisfying the Cauchy NIP in which the (unrestricted)
NIP fails, such as the formal Laurent series field over R.


⁹ Examples for the second kind are given by hyperreal fields of type $\eta_1$.
3.8 Alternative Development Routes*


To build the field R of real numbers from the natural numbers N, three different 
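paths may be followed depending on which intermediate classes of numbers are built
on the way, as shown in the following diagram.


[Diagram: three routes from N to R. Top: $\mathbf{N} \to \mathbf{Q}^+ \to \mathbf{R}^+ \to \mathbf{R}$. Middle: $\mathbf{N} \to \mathbf{Q}^+ \to \mathbf{Q} \to \mathbf{R}$. Bottom: $\mathbf{N} \to \mathbf{Z} \to \mathbf{Q} \to \mathbf{R}$.]


In our development, we followed the topmost path $\mathbf{N} \to \mathbf{Q}^+ \to \mathbf{R}^+ \to \mathbf{R}$, which
avoids negative numbers and zero until the last step.

Suppes [77] follows the middle route $\mathbf{N} \to \mathbf{Q}^+ \to \mathbf{Q} \to \mathbf{R}$.

Stoll [76] uses the “algebraic” bottom route $\mathbf{N} \to \mathbf{Z} \to \mathbf{Q} \to \mathbf{R}$, where the
ordered integral domain Z is first built from N. One then builds Q as the field of
fractions of Z, a process applicable to any integral domain.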

In the last step of building R from Q, both Suppes and Stoll use an alternative way 
of building R due to Cantor, which is quite distinct from the method of Dedekind 
cuts (due to Dedekind) that we used in this text. 

Cantor’s method represents real numbers as “Cauchy sequences of rationals.” A
sequence $\langle \rho_n \rangle$ of rational numbers is a Cauchy sequence if for any $\varepsilon > 0$, there is a $k$
such that $|\rho_m - \rho_n| < \varepsilon$ for all $m, n > k$. Two Cauchy sequences of rationals $\langle \rho_n \rangle$ and
$\langle \sigma_n \rangle$ are called equivalent if for any $\varepsilon > 0$, there is a $k$ such that $|\rho_n - \sigma_n| < \varepsilon$ for all
$n > k$. Real numbers are then defined as equivalence classes of Cauchy sequences
of rationals, with operations on them defined by performing the operation term-wise
on the sequences.
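
The equivalence condition can be illustrated on two different rational sequences that approach the same missing point. The Python sketch below generates both by the Newton iteration $x \mapsto (x + 2/x)/2$, a choice made here purely for illustration and not taken from the text; the names `newton_sqrt2` and `close_from` are hypothetical. Only a finite prefix is inspected, so the check is suggestive rather than a proof of equivalence.

```python
from fractions import Fraction


def newton_sqrt2(start: Fraction, n: int):
    """First n terms of x_{j+1} = (x_j + 2 / x_j) / 2, starting from start."""
    terms = [start]
    for _ in range(n - 1):
        x = terms[-1]
        terms.append((x + 2 / x) / 2)
    return terms


def close_from(p, q, eps: Fraction, k: int) -> bool:
    """Check |p_n - q_n| < eps for every computed index n >= k."""
    return all(abs(a - b) < eps for a, b in zip(p[k:], q[k:]))


p = newton_sqrt2(Fraction(1), 7)     # 1, 3/2, 17/12, ...
q = newton_sqrt2(Fraction(3), 7)     # 3, 11/6, 193/132, ...

eps = Fraction(1, 100)
assert close_from(p, q, eps, 3)          # from index 3 on the terms agree within eps
assert not close_from(p, q, eps, 2)      # at index 2 they still differ by 6/132
```

In Cantor's construction the two sequences would be placed in the same equivalence class and so represent the same real number.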

Cantor’s method leads to a far-reaching generalization known as the metric 
completion, applicable to a class of spaces called metric spaces. It captures the 
intuitive idea of “filling in” the “missing spatial points” —points to which a sequence 
“tries but fails” to converge. Furthermore, it is applicable to an arbitrary ordered 
field, giving what is known as the Cauchy-completion of the field. To see how 
Cantor’s method is carried out in the general context of ordered fields, see Hewitt 
and Stromberg [30]. 

On the other hand, Dedekind’s method of cuts is based on order and captures 
the geometric notion of a linear continuum in a highly intuitive manner. Dedekind’s 
idea of continuity amounts to the condition that if the line is partitioned into two 
pieces then at least one of the pieces must contain a limit point of the other. This 
idea itself leads to a direct but far-reaching generalization—a concept known as 
connectedness, which is a form of continuity applicable to the most general types of
spaces called topological spaces.¹⁰


3.9 Complex Numbers* 


We define a complex number to be an ordered pair of real numbers. Complex 
numbers form a field (unordered) with sum and product defined as follows. 


Definition 214 (Addition and Multiplication of Complex Numbers). 


(A, B) + (C,D):=(A+C,B+D), 
(A, B)-(C, D) := (AC — BD, BC + AD). 


A complex number of the form (A, 0) is called a real complex number. 


Problem 215. The mapping $A \mapsto (A, 0)$ is a bijection from the set of real numbers
onto the set of real complex numbers and it preserves both the operations + and -: 


(A+ B,0) =(A,0)+(B,0), and (AB,0) = (A,0)(B,0). 


At this point, the real numbers and the real complex numbers become interchange- 
able since all the properties of the real numbers are possessed by the real complex 
numbers. 

Therefore, we throw away the real numbers¹¹ and use the corresponding real
complex numbers in their place. In particular, the real complex number (A, 0) will 
be denoted simply by A. 


Definition 216. i := (0,1). 


Problem 217. With our convention of using A as an abbreviation for (A, 0), prove 
that 


$$i^2 = -1 \quad\text{and}\quad (A, B) = A + Bi.$$
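
Both identities can be verified mechanically from Definition 214. The Python sketch below represents a complex number as a pair and uses `Fraction` components as stand-ins for real numbers, which is an assumption of the sketch; the names `c_add`, `c_mul`, `real`, and `I` are hypothetical.

```python
from fractions import Fraction


def c_add(z, w):
    """(A, B) + (C, D) := (A + C, B + D)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def c_mul(z, w):
    """(A, B) . (C, D) := (AC - BD, BC + AD)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, b * c + a * d)


def real(a):
    """The real complex number (A, 0) abbreviated by A."""
    return (a, 0)


I = (0, 1)

assert c_mul(I, I) == real(-1)                         # i^2 = -1

a, b = Fraction(3), Fraction(-2, 5)
assert c_add(real(a), c_mul(real(b), I)) == (a, b)     # (A, B) = A + Bi
```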


One problem with the real numbers is that one cannot solve equations like
$x^2 + 1 = 0$. Complex numbers guarantee the existence of roots for not only such
equations but also any arbitrary polynomial equation. We have the following basic
theorem.


Theorem 218 (Fundamental Theorem of Algebra). Any non-constant polynomial
with complex coefficients has a complex root.


The proof of this theorem is beyond the scope of this book. A proof can be found in
Birkhoff and Mac Lane [4], A Survey of Modern Algebra.


¹⁰ Dedekind’s condition can be used word for word to define the notion of connectedness: A
topological space is connected if and only if whenever it is partitioned into two pieces then at
least one of the pieces contains a limit point of the other.


¹¹ Phrase of Edmund Landau [47].


Chapter 4 
Postscript I: What Exactly Are the Natural 
Numbers? 


Abstract This postscript to Part I consists of philosophical and historical remarks 
concerning the nature of the natural numbers. It contrasts the absolutist approach 
requiring absolute constructions of individual natural numbers such as those given 
by Frege, Russell, Zermelo, and von Neumann, with Dedekind’s structuralist 
approach in which the natural numbers can be taken as members of any
Dedekind–Peano system.


Note: In this postscript we will often use the variant convention that the natural 
numbers include 0 and so start from 0 instead of 1.


4.1 Russell’s Absolutism? 


In the last couple of chapters we outlined how the field of real and complex 
numbers, and thus essentially the entire body of traditional pure mathematics, can 
be deductively developed starting from only the natural numbers based on the 
Dedekind—Peano Axioms. However, the Dedekind—Peano Axioms do not specify 
what the natural numbers themselves really are, and thus leave the interpretation of 
the notion of natural numbers open. 

The following passage is quoted from Russell (1920) [68] to illustrate the 
problem of finding an absolute interpretation for the natural numbers. 


...Peano’s three primitive ideas – namely, “0,” “number,” and “successor” – are capable of
an infinite number of different interpretations, all of which will satisfy the five primitive 
propositions. We will give some examples. 

(1) Let “0” be taken to mean 100, and let “number” be taken to mean the numbers 
from 100 onward in the series of natural numbers. Then all our primitive propositions are 
satisfied, even the fourth, for, though 100 is the successor of 99, 99 is not a “number” in the 
sense which we are now giving to the word “number.” It is obvious that any number may be 
substituted for 100 in this example. 
(2) Let “0” have its usual meaning, and let “number” mean what we usually call “even
numbers,” and let the “successor” of a number be what results from adding two to it. Then 
“1” will stand for the number two, “2” will stand for the number four, and so on; the series 
of “numbers” now will be 


0, two, four, six, eight, ... 


All Peano’s five premisses are satisfied still. 
(3) Let “0” mean the number one, let “number” mean the set


1, 1/2, 1/4, 1/8, 1/16, ...


and let “successor” mean “half.” Then all Peano’s five axioms will be true of this set.
It is clear that such examples might be multiplied indefinitely. In fact, given any series 


x₀, x₁, x₂, x₃, ..., xₙ, ...


which is endless, contains no repetitions, has a beginning, and has no terms that cannot be 
reached from the beginning in a finite number of steps, we have a set of terms verifying 
Peano’s axioms. 


In Peano’s system there is nothing to enable us to distinguish between these different 
interpretations of his primitive ideas. It is assumed that we know what is meant by “0,” and 
that we shall not suppose that this symbol means 100 or Cleopatra’s Needle or any of the 
other things that it might mean. 

This point, that “0” and “number” and “successor” cannot be defined by means of 
Peano’s five axioms, but must be independently understood, is important. We want our 
numbers not merely to verify mathematical formulas, but to apply in the right way to 
common objects. We want to have ten fingers and two eyes and one nose. A system in which 
“1” meant 100, and “2” meant 101, and so on, might be all right for pure mathematics, but 
would not suit daily life. We want “0” and “number” and “successor” to have meanings 
which will give us the right allowance of fingers and eyes and noses. We have already some 
knowledge (though not sufficiently articulate or analytic) of what we mean by “1” and 
“2” and so on, and our use of numbers in arithmetic must conform to this knowledge. We 
cannot secure that this shall be the case by Peano’s method; all that we can do, if we adopt 
his method, is to say “we know what we mean by ‘0’ and ‘number’ and ‘successor,’ though 
we cannot explain what we mean in terms of other simpler concepts.” ... 

It might be suggested that, instead of setting up “0” and “number” and “successor” as
terms of which we know the meaning although we cannot define them, we might let them 
stand for any three terms that verify Peano’s five axioms. They will then no longer be terms 
which have a meaning that is definite though undefined: they will be “variables,” terms 
concerning which we make certain hypotheses, namely, those stated in the five axioms, but 
which are otherwise undetermined. If we adopt this plan, our theorems will not be proved 
concerning an ascertained set of terms called “the natural numbers,” but concerning all sets 
of terms having certain properties. Such a procedure is not fallacious; indeed for certain 
purposes it represents a valuable generalization. But from two points of view it fails to give 
an adequate basis for arithmetic. In the first place, it does not enable us to know whether 
there are any sets of terms verifying Peano’s axioms; it does not even give the faintest 
suggestion of any way of discovering whether there are such sets. In the second place, as 
already observed, we want our numbers to be such as can be used for counting common 
objects, and this requires that our numbers should have a definite meaning, not merely that 
they should have certain formal properties. [68, pages 7-10] 
4.2 Interpretations for the Natural Numbers


The Frege–Russell Natural Numbers. In 1884 Frege had already built an interpretation
for the natural numbers satisfying Russell’s requirements above (which was
later re-invented by Russell himself). It is based on the natural principle of abstraction,
which defines a complete invariant for a given equivalence relation by assigning
to each object its own equivalence class. The “Frege–Russell invariant” is obtained
by applying this principle to the relation of one-to-one correspondence between sets.
Two sets are called equinumerous (i.e., they have the same “number” of elements) if
there is a one-to-one correspondence between them.¹ Equinumerosity is easily seen
to be an equivalence relation, and the number of elements of a set A is then defined as
the equivalence class [A] of A, i.e., the collection of all sets equinumerous to A.
For example, the first few Frege—Russell numbers are 


0 := [∅] = {∅}
1 := [{a}] = the collection of all singletons 


2 := [{a, b}] (a, b distinct) = the collection of all doubletons, etc. 


The Zermelo Natural Numbers. In 1908, Zermelo [85] gave a definition of the 
natural numbers in his framework of axiomatic set theory as follows: 


0 := ∅, 1 := {∅}, 2 := {{∅}}, 3 := {{{∅}}}, ..., n + 1 := {n}, ...


The Von Neumann Natural Numbers. In 1923, von Neumann built another interpretation
for the natural numbers which has become standard in modern axiomatic set
theories. His interpretation is as follows:


0 := ∅

1 := {0} = {∅}

2 := {0, 1} = {∅, {∅}}

3 := {0, 1, 2} = {∅, {∅}, {∅, {∅}}}


n + 1 := {0, 1, 2, ..., n} = n ∪ {n}


¹Equinumerosity and cardinal numbers will be studied in Chap. 5.

²Problems arise with the naive Frege–Russell invariant (Chap. 20) which can only be addressed
by using less natural approaches such as Quine’s New Foundations or Scott’s modified invariant 
(Definition 1297) in the context of ZF set theory. 
Here every natural number n is defined as a simple and canonical n-element set
consisting precisely of the smaller natural numbers, and the successor function is 
defined as S(x) := x U {x}. Von Neumann’s method also extends to the transfinite, 
giving a canonical interpretation for the ordinal numbers (which was the original 
purpose of von Neumann, see Chap. 21). 


Each of the above definitions of natural numbers (Frege—Russell, Zermelo, and 
Von Neumann) provides a valid interpretation for the three primitive notions of 
Dedekind—Peano so that in each framework the five Dedekind—Peano axioms can 
be derived as theorems. 
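The Zermelo and von Neumann constructions above can be imitated for small n with Python's immutable frozenset type. This is only a finite sketch under that assumption: the function names zermelo and von_neumann are illustrative, and nothing here replaces the set-theoretic definitions.

```python
def zermelo(n: int) -> frozenset:
    """Zermelo numeral: 0 is the empty set and n + 1 is {n}."""
    x = frozenset()
    for _ in range(n):
        x = frozenset([x])
    return x

def von_neumann(n: int) -> frozenset:
    """Von Neumann numeral: 0 is the empty set and S(x) = x ∪ {x}."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset([x])
    return x

# The von Neumann numeral n has exactly n elements: the smaller numerals.
three = von_neumann(3)
print(len(three))
print(three == frozenset(von_neumann(k) for k in range(3)))

# Every nonzero Zermelo numeral is a singleton whose only member is its predecessor.
print(len(zermelo(4)), zermelo(3) in zermelo(4))
```

In both encodings the successor map is injective and never returns the empty set, which is the finite shadow of two of the Dedekind–Peano axioms.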


4.3 Dedekind’s Structuralism 


The view expressed by Russell’s comments above on finding an absolute interpreta- 
tion for the natural numbers as the “real one” is sometimes called Frege—Russell 
absolutism. This is in sharp contrast to Dedekind’s structuralism, expressed by 
Dedekind in 1888, which we now discuss. 

As illustrated by Russell’s comments above and by Zermelo and von Neumann’s 
definition of the natural numbers, there are many possible interpretations for the 
Dedekind—Peano axioms. Dedekind proved that all interpretations for the natural 
numbers which satisfy the Dedekind—Peano axioms have the same structure, or are 
“isomorphic.” This is known as the categoricity of the Dedekind—Peano axioms 
and is given in the theorem below. It suggests that there is no reason to prefer 
any interpretation over any other. Thus, according to Dedekind’s structuralism, one 
cannot take any specific interpretation of the natural numbers as the real one; rather, 
the true concept of natural number is given by the abstract common structure present 
in all interpretations which satisfy the Dedekind—Peano axioms. 


Definition 219. A Dedekind–Peano system N, 1_N, φ consists of a set N, an element
1_N, and a function φ which satisfy:


1. 1_N ∈ N

2. φ: N → N

3. 1_N ∉ φ[N] = ran(φ)

4. φ is injective

5. If P is a subset of N such that


a. 1_N ∈ P, and
b. For all x ∈ N, x ∈ P ⇒ φ(x) ∈ P,


then P = N.


Clearly, the five conditions above correspond precisely to the five Dedekind–Peano
axioms, with N playing the role of the set of natural numbers, 1_N that of 1, and φ
that of the successor function. The following theorem shows that any two
Dedekind–Peano systems are “isomorphic,” that is, they have the same structure:


Theorem 220 (Dedekind). If N, 1_N, φ and Ω, 1_Ω, θ are two Dedekind–Peano
systems then there is a unique bijection ψ: N → Ω such that ψ(1_N) = 1_Ω and


ψ(φ(x)) = θ(ψ(x)) for all x ∈ N.


The function ψ in the theorem is the isomorphism between the two systems. The
reader is invited to construct a proof of this categoricity theorem using the method 
of recursive definition given in Sect. 2.10. 
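The recursion behind Theorem 220 can be illustrated on a finite initial segment. The sketch below builds ψ step by step from the two conditions ψ(1_N) = 1_Ω and ψ(φ(x)) = θ(ψ(x)). The function name partial_isomorphism and the two sample systems (positive integers with n + 1, and von Neumann style sets with x ∪ {x}) are assumptions chosen for illustration, and a finite table is of course not a proof of the theorem.

```python
from typing import Callable, Dict, Hashable

def partial_isomorphism(
    base_n: Hashable,
    phi: Callable,
    base_omega: Hashable,
    theta: Callable,
    steps: int,
) -> Dict:
    """Tabulate psi on base_n, phi(base_n), ... using
    psi(base_n) = base_omega and psi(phi(x)) = theta(psi(x))."""
    psi = {base_n: base_omega}
    x = base_n
    for _ in range(steps):
        psi[phi(x)] = theta(psi[x])
        x = phi(x)
    return psi

# System 1 (illustrative): positive integers, base 1, successor n + 1.
# System 2 (illustrative): base = empty set, successor S(x) = x ∪ {x}.
empty = frozenset()
psi = partial_isomorphism(1, lambda n: n + 1, empty, lambda x: x | frozenset([x]), 4)

for n, image in sorted(psi.items()):
    print(n, "->", len(image), "element set")
```

Each value of ψ is forced by the previous one, which is the informal reason the isomorphism in the theorem is unique.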

What we are calling Dedekind–Peano systems were called simply infinite systems
by Dedekind himself. For a Dedekind–Peano system N, 1_N, φ, Dedekind uses the
terminology that “the simply infinite system N is set in order by this function φ”
with 1 being the “base-element of N.” Dedekind wrote in 1888 [11]:


73. Definition. If in the consideration of a simply infinite system N set in order by a 
function φ we entirely neglect the special character of the elements, merely retaining their
distinguishability and taking into account only the relations to one another in which they
are placed by the order-setting function φ, then are these elements called natural numbers
or ordinal numbers or simply numbers, and the base-element 1 is called the base-number
of the number-series N. With reference to this freeing the elements from every other 
content (abstraction) we are justified in calling numbers a free creation of the human mind. 
The relations or laws which are derived entirely from the conditions [...], and which are 
therefore always the same in all ordered simply infinite systems, whatever names may 
happen to be given to the individual elements (compare 134), form the first object of the 
science of numbers or arithmetic. [12, p. 68] 


He continues later: 


132. Theorem. All simply infinite systems are similar to the number-series N and 
consequently [...] also to one another. 


133. Theorem. Every system that is similar to a simply infinite system and therefore [... ] 
to the number-series N is simply infinite. 


134. Remark. By the two preceding theorems (132), (133) all simply infinite systems form
a class in the sense of [an equivalence class for the isomorphism relation]. At the same time,
[...] it is clear that every theorem regarding numbers, i.e., regarding the elements n of the
simply infinite system N set in order by the transformation φ, and indeed every theorem
in which we leave entirely out of consideration the special character of the elements n
and discuss only such notions as arise from the arrangement φ, possesses perfectly general
validity for every other simply infinite system Ω set in order by a transformation θ and its
elements ν, and that the passage from N to Ω (e.g., also the translation of an arithmetic
theorem from one language into another) is effected by the transformation ψ considered in
(132), (133), which changes every element n of N into an element ν of Ω, i.e., into ψ(n).
This element ν can be called the nth element of Ω and accordingly the number n is itself
the nth number of the number-series N. The same significance which the transformation
φ possesses for the laws in the domain N, in so far as every element n is followed by a
determinate element φ(n) = n′, is found, after the change effected by ψ, to belong to the
transformation θ for the same laws in the domain Ω, in so far as the element ν = ψ(n)
arising from the change of n is followed by the element θ(ν) = ψ(n′) arising from the
change of n′; we are therefore justified in saying that by ψ, φ is changed into θ, which
is symbolically expressed by θ = ψφψ̄, φ = ψ̄θψ. By these remarks, as I believe, the
definition of the notion of numbers given in (73) is fully justified. [12, p. 92-96]


In Dedekind’s structuralist approach, the existence of natural numbers is tantamount 
to the existence of at least one Dedekind—Peano system. Dedekind observed that this 
follows from the existence of an infinite set in his sense, also called a reflexive or 
Dedekind infinite set, that is, a set for which there is an injective function mapping 
the set into a proper subset of itself.³ Such an existence proof again has a much more
structuralist flavor than the absolutist presentations of Frege—Russell, Zermelo, or 
von Neumann, where natural numbers are constructed in a unique canonical way 
with every natural number having a specific absolute definition. 

Most regular mathematicians (as opposed to set theorists or logicians) do not
think of natural numbers as absolute constructs as presented in the Frege–Russell,
Zermelo, or von Neumann definitions. It is fair to say that Dedekind’s viewpoint
above had a tremendous impact on later mathematicians such as Hilbert, and has
overwhelmingly dominated the approach found in modern mathematics.⁴


Other Mathematical Notions. The distinction between the absolutist and the 
structuralist approaches applies not only to natural numbers but also to many other 
mathematical concepts. For example, the notion of ordered pair was reduced to 
an absolutist definition in terms of sets first by Wiener in 1914 as (a, b) :=
{{a}, {b, ∅}}, and then again by Kuratowski in 1921 as (a, b) := {{a}, {a, b}}.
Both definitions satisfy the characterizing criterion for the ordered pair, namely:
(a, b) = (c, d) ⇒ a = c and b = d. For a structuralist, it is the characterizing
criterion that matters the most. 
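The characterizing criterion can be tested by brute force on a small universe of hereditarily finite sets. The sketch below checks Kuratowski's definition and, as a negative control that is not from the text, the unordered pair {a, b}, which fails the criterion. The names kuratowski, unordered, and satisfies_criterion are illustrative, and a finite search only supports the general claim without proving it.

```python
from itertools import product

def kuratowski(a, b) -> frozenset:
    """Kuratowski pair: (a, b) := {{a}, {a, b}}."""
    return frozenset([frozenset([a]), frozenset([a, b])])

def unordered(a, b) -> frozenset:
    """Negative control (not a valid ordered pair): {a, b}."""
    return frozenset([a, b])

def satisfies_criterion(pair, universe) -> bool:
    """Check pair(a, b) == pair(c, d) implies a == c and b == d on the universe."""
    items = list(universe)
    for a, b, c, d in product(items, repeat=4):
        if pair(a, b) == pair(c, d) and (a, b) != (c, d):
            return False
    return True

empty = frozenset()
one = frozenset([empty])
universe = [empty, one, frozenset([one]), frozenset([empty, one])]

print(satisfies_criterion(kuratowski, universe))
print(satisfies_criterion(unordered, universe))
```

From the structuralist point of view, any encoding that passes this test on all sets is an acceptable notion of ordered pair.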

The real numbers also can be considered either in absolutist or in structuralist 
terms. The construction of the real numbers that we presented using “Dedekind cuts 
of ratios” is an example of the absolutist approach. On the other hand, it is common 
for modern analysis texts to take a structuralist approach to the real numbers, where 
the system of real numbers is simply taken to be any complete ordered field.⁵ Such a
structuralist definition is sound because of the corresponding categoricity theorem: 
Any two complete ordered fields are isomorphic as ordered fields (Theorem 213). 
However, there is no simple “structuralist” existence proof in this case, and all
known constructions of the real numbers, either using Dedekind’s method or using 
Cantor’s method, require some “hard work.” 


³This existence result is related to the Axiom of Infinity and its equivalent forms. See the part of
Sect. 21.5 dealing with the Axiom of Infinity where this topic is further discussed, particularly
Theorem 1263, as well as Problem 1225.


⁴The literature is large on the topics of this postscript. See, e.g., [63,71], and the chapters by
Hellman in [72] and [49], where further references can be found.


⁵Even in our absolute constructions of the previous chapters for extending the system of natural
numbers to larger and larger systems of numbers such as the ratios, the lengths, the real numbers, 
and the complex numbers, we had already used the structuralist approach by throwing away old 
entities and replacing them with “isomorphic copies” found within the new extensions. 


Part II 
Cantor: Cardinals, Order, and Ordinals 


Introduction to Part II 


This part contains the core material of the book. 

Chapters 5–10 cover cardinals, finitude, countability and uncountability, cardinal
arithmetic, the theory of order types, dense and complete orders, well-orders, 
transfinite induction, ordinals, and alephs—almost all of which are due to Cantor. 

In addition, Chaps. 10 and 11 cover the basic facts about the Axiom of Choice 
and equivalent maximal principles such as Zorn’s Lemma, as well as well-founded 
relations and trees. 

The postscript to this part (Chap. 12) briefly presents a selection of some of the 
most elementary topics of Infinitary Combinatorics. 
Chapter 5
Cardinals: Finite, Countable, and Uncountable 


Abstract This chapter introduces the basic idea of cardinal numbers, comparability, 
and operations, and next covers the theory of finite sets and natural numbers, from 
which the Dedekind—Peano axioms are derived as theorems. Dedekind infinite sets 
and reflexive cardinals are also defined. It then presents the Axiom of Choice and 
contrasts it with effective choice, using the notion of effectiveness informally. The 
rest of the chapter is about countability and uncountability: It focuses on the two 
specific cardinals ℵ₀ = |N| and c = |R|, and gives the first proof of ℵ₀ < c
(uncountability of R). In the process, the principles of countable and dependent 
choice are encountered. 


5.1 Cardinal Numbers 


Recall that f is said to be a one-to-one correspondence between A and B if f: A →
B is a bijection (i.e., f is a one-to-one function mapping A onto B).


Definition 221 (Similar or Equinumerous Sets). Two sets A and B are called 
similar, or equinumerous, written A ~_c B (or simply A ~ B) if there is a
one-to-one correspondence between A and B.


Problem 222. We have: (a) A ~ A, (b) A ~ B ⇒ B ~ A, and (c) A ~ B
and B ~ C ⇒ A ~ C. Thus equinumerosity, ~, is an equivalence relation.


We now permanently fix, once and for all, a specific complete invariant A ↦ |A|
for the equivalence relation of similarity (equinumerosity). For any set A, we call 
|A| the cardinal number of A. 


Definition 223 (Cardinal Number, Cantor). For each set A, |A| denotes its 
cardinal number and satisfies the condition: 


|A| = |B| if and only if A ~ B, for all sets A and B.
We say that α is a cardinal number if α is the cardinal number of some set.


Discussion. At this point, “the cardinal number of a set” is simply a primitive notion 
serving as a complete invariant for the relation of similarity of sets. The main reason 
for introducing it is that the cardinal numbers form a generalization of the natural 
numbers which extends into the transfinite. 

Historically, cardinal numbers (first introduced by Cantor in their full
generality) were defined in two main ways, one known as the Frege–Russell
definition and the other we call the Cantor–Von Neumann definition.

The Frege—Russell definition uses the natural complete invariant associated 
with the equivalence relation of similarity of sets—the quotient map given by the 
Principle of Abstraction (Theorem 42)—to define cardinals: |A| is defined as the 
equivalence class [A] of A under the similarity relation, i.e., |A| equals the collection
of all sets similar to A. Although a natural definition, this becomes problematic as 
the “collection” of all sets similar to A is so large that it is questionable whether it 
is a legitimate collection at all (see Chap. 20). In certain formal set theories such 
as the Zermelo—Fraenkel system (ZF), such a collection does not even exist as a 
set (Problem 1296), although the definition works in some other systems such as 
Quine’s New Foundations. In ZF, a modified definition by Dana Scott, called the 
Frege—Russell—Scott definition, handles the problem by significantly reducing the 
collection which serves as the cardinal number of a set. We will discuss the Frege— 
Russell—Scott definition in Sect. 21.8 (Definition 1297). 

The Cantor–Von Neumann definition is technically more complicated and needs
the notions of well-orders and ordinal numbers which will be defined later. Still, the
definition goes as follows: |A| is defined as the least (von Neumann) ordinal α such
that A can be well-ordered with type α. So in order for |A| to exist, the set A needs
to be well-orderable, which in turn requires a special axiom called the Axiom of
Choice. Thus the Cantor–Von Neumann method cannot be used to effectively define
cardinal numbers of arbitrary sets (such as the set of real numbers R) without the
use of the Axiom of Choice. It is however the one found in Cantor’s original conception
of the transfinite and still is the more common definition of cardinal number used
in formal set theory with the Axiom of Choice (such as ZFC). We will present it as
Definition 1266 in Sect. 21.4 on von Neumann ordinals.


Definition 224 (0 and 1). We define 0 to be the cardinal number of the empty set 
and 1 to be the cardinal number of the singleton set {0}: 


0 := |∅|, and 1 := |{0}|.


We remind the reader that a singleton was defined as a set of the form {a}, so A is 
a singleton if A contains an element a and no other element, i.e., if there is some a 
such that for all x, x ∈ A ⇔ x = a.


Problem 225. |A| = 0 if and only if A = ∅ and |A| = 1 if and only if A is a
singleton.
Comparison of Cardinals and Sets


Definition 226 (Comparison of Cardinals and Sets). For sets A and B, we write
A ≼ B if there is a one-to-one function f: A → B (which may or may not be onto).
We also write A ⋠ B for “not A ≼ B.”

For cardinal numbers α and β, we write α ≤ β if A ≼ B for some sets A and B
with |A| = α and |B| = β.


General cardinal and set comparison, where both finite and infinite sets are allowed, 
behaves quite differently from the familiar situation of finite sets. 


Problem 227. If |A| = α, |B| = β, and f: A → B is one-to-one but not onto,
then which of the following statements must necessarily be true?


(a) A ≼ B and α ≤ β; (b) α ≤ β but α ≠ β; (c) A ≼ B but not A ~ B.


So it is quite possible that for sets A and B, A is similar to some proper subset of 
B, while at the same time B is similar to some proper subset of A. 


Problem 228. Give examples showing that the last statement is correct. 


This is very different from the case of finite sets—in fact, we will see that this is 
impossible if A or B is finite, when we formally define finite sets below. 

To analyze the general situation for two sets A and B with cardinal numbers 
α = |A| and β = |B|, the following four possibilities are mutually exclusive and
exhaustive (meaning that exactly one of these holds): 


Definition 229. Let A and B be sets with α = |A| and β = |B|. Then exactly one
of the following cases holds, and in each case we give a definition:


1. Both A ≼ B and B ≼ A. We then say A is weakly equivalent to B, and write
this as A ~* B, and also write α =* β.

2. A ≼ B but not B ≼ A. We then write A ≺ B and α < β, and say α is less
than β.

3. B ≼ A but not A ≼ B. Here we write A ≻ B and α > β, and say α is more
than β.

4. Neither A ≼ B nor B ≼ A. In this case we say that A is not comparable to B
and α and β are incomparable cardinals, writing this as α ∥ β.


Each of these relations is invariant over equinumerosity ~, i.e., if A ~ A′ and
B ~ B′, then any of the above four relations will hold between A′ and B′ if and
only if it holds between A and B. Hence the four cardinal relations


(1) α =* β, (2) α < β, (3) α > β, (4) α ∥ β,


are well defined, exactly one of which always holds between any pair of cardinals α
and β. It is also easy to see that β > α if and only if α < β.
Problem 230. 0 < 1, and so 0 ≠ 1.

By entirely routine arguments we have: 


Problem 231. The relation A ≼ B for sets and the corresponding relation α ≤ β
for cardinals are both reflexive and transitive. The relations A ≺ B for sets and
α < β for cardinals are asymmetric, and therefore irreflexive.


The following is only slightly more interesting. 


Problem 232. If A ≺ B and B ≼ C, or if A ≼ B and B ≺ C, then A ≺ C. So
for cardinals α, β, γ, if α < β and β ≤ γ, or if α ≤ β and β < γ, then α < γ. The
relations A ≺ B for sets and α < β for cardinals are transitive.


Problem 233. Suppose that A and B are sets, a ∉ A, and b ∉ B. Then


1. A ~ B if and only if A ∪ {a} ~ B ∪ {b}.
2. A ≼ B if and only if A ∪ {a} ≼ B ∪ {b}.
3. A ≺ B if and only if A ∪ {a} ≺ B ∪ {b}.


Can We Get Trichotomy? 


We would like to establish that the relation < is an ordering of the cardinals, and 
so we need the law of trichotomy for <. But all we have at this point is that exactly 
one of 


(1) α =* β, (2) α < β, (3) α > β, (4) α ∥ β,


holds, which is a long way from trichotomy. 
Our goal to obtain trichotomy can be realized if, for the four conditions above, 
we can 


• Replace condition (1) α =* β by the condition α = β; and
• Prove that condition (4), incomparability, cannot hold.


Later, using the Axiom of Choice, we will see that condition (4) is in fact impossible. 
For now, we discuss how we can replace “α =* β” by “α = β,” i.e., how to prove
the equivalence


α =* β ⇔ α = β.


The implication α = β ⇒ α =* β holds trivially. The converse implication (α =*
β ⇒ α = β) is also true, but the proof is nontrivial. We can restate it as “weakly
equivalent sets are equinumerous” (i.e., A ~* B ⇒ A ~ B), a result called the
Cantor–Bernstein Theorem or Schröder–Bernstein Theorem.


Theorem (Cantor–Bernstein). If A ≼ B and B ≼ A, then A ~ B. Therefore, the
relation ≤ defined on the cardinals is antisymmetric.


This theorem will be established in the next chapter. However, it is instructive for 
the reader to attempt a proof at this point. 

Proving that ≤ is connected requires (in fact is equivalent to) the Axiom of
Choice, and will be given much later in Theorem 719. 


5.2 Sum and Product of Cardinal Numbers 


Problem 234 (Disjoint Copies of sets). Given any sets A, B there exist disjoint
sets A′, B′ with A ~ A′ and B ~ B′.


[Hint: Take A′ = {0} × A and B′ = {1} × B.]


Problem 235 (Uniqueness of Sum). If A ~ A′, B ~ B′, and A ∩ B = ∅ =
A′ ∩ B′, then (A ∪ B) ~ (A′ ∪ B′).


Problem 236. Given cardinals α and β there is a unique cardinal γ such that there
are disjoint sets A and B with |A| = α, |B| = β, and |A ∪ B| = γ.


Definition 237 (Sum of two Cardinal Numbers). Given cardinal numbers α and
β, the unique cardinal number γ whose existence is guaranteed by Problem 236 is
called the sum of α and β and is denoted by α + β.
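For finite sets, the construction behind the cardinal sum can be tried out directly. The sketch below uses the tagging trick from the hint to Problem 234, with Python's len standing in for the cardinal number of a finite set. The function names disjoint_copies and cardinal_sum are illustrative assumptions, and the finite case says nothing about infinite cardinals.

```python
def disjoint_copies(a: set, b: set):
    """Return copies {0} × A and {1} × B, which are disjoint even if A and B overlap."""
    a_copy = {(0, x) for x in a}
    b_copy = {(1, y) for y in b}
    return a_copy, b_copy

def cardinal_sum(a: set, b: set) -> int:
    """Size of the union of disjoint copies of A and B (finite sets only)."""
    a_copy, b_copy = disjoint_copies(a, b)
    assert a_copy.isdisjoint(b_copy)
    return len(a_copy | b_copy)

A = {1, 2, 3}
B = {3, 4}

# The plain union undercounts because A and B share the element 3.
print(len(A | B), cardinal_sum(A, B))
```

Replacing A or B by any other set of the same size leaves the result unchanged, which is the finite content of Problem 235.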


Problem 238. The sum of cardinal numbers is an associative and commutative
operation with 0 as the identity. In other words, for any cardinals α, β, γ:


α + (β + γ) = (α + β) + γ, α + β = β + α, α + 0 = α.


Hence it follows that: α + (β + 1) = (α + β) + 1.


Problem 239. For cardinals α, β, we have β = α + 1 if and only if there is a set A
with |A| = α and x ∉ A such that β = |A ∪ {x}|.

Problem 240. If α, β are cardinals, then α ≤ β if and only if β = α + γ for some
cardinal γ.

Problem 241. If α, β are cardinals, then:

1. α + 1 = β + 1 if and only if α = β.
2. α + 1 ≤ β + 1 if and only if α ≤ β.
3. α + 1 < β + 1 if and only if α < β.


[Hint: Use Problem 233.]
Problem 242 (Uniqueness of Product). A ~ A′ and B ~ B′ ⇒ A × B ~ A′ × B′.


By the last Problem, given α = |A| and β = |B|, the product αβ := |A × B| is well
defined.


Definition 243 (Product of two Cardinal Numbers). Given cardinals α and β,
the product αβ is the unique cardinal number γ such that γ = |A × B| for some
A, B with |A| = α and |B| = β.


Problem 244. Cardinal product is associative and commutative, with 1 as the
identity. In other words, for any cardinals α, β, γ:


α(βγ) = (αβ)γ, αβ = βα, 1α = α.


Note that A × B can be naturally partitioned into the pairwise disjoint family
⟨A × {b} | b ∈ B⟩ indexed by B:


A × B = ⋃_{b ∈ B} A × {b},


where (A × {b}) ~ A for all b ∈ B. In other words, with α = |A| and β = |B|,
A × B is the union of β-many pairwise disjoint sets, each having cardinality α. Hence
αβ may be regarded as the result of “summing α repeatedly, β times” (see
Definition 355 and Problem 357).


Problem 245. The distributive law for cardinals holds: α(β + γ) = αβ + αγ.
Hence it follows that α(β + 1) = αβ + α.
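The partition of A × B into the slices A × {b}, and the distributive law, can both be checked on small finite sets. In the sketch below the function names cartesian and slices and the sample sets are illustrative assumptions, and B and C are chosen disjoint so that their union models the sum β + γ.

```python
from itertools import product

def cartesian(a: set, b: set) -> set:
    """The Cartesian product A × B as a set of pairs."""
    return set(product(a, b))

def slices(a: set, b: set) -> dict:
    """The family A × {y}, indexed by the elements y of B."""
    return {y: {(x, y) for x in a} for y in b}

A = {"p", "q", "r"}
B = {0, 1}
C = {2, 3, 4}  # disjoint from B

family = slices(A, B)

# The slices cover A × B, and each slice has as many elements as A.
covers = set().union(*family.values()) == cartesian(A, B)
same_size = all(len(s) == len(A) for s in family.values())

# Distributive law: A × (B ∪ C) splits into the disjoint pieces A × B and A × C.
left = cartesian(A, B | C)
right = cartesian(A, B) | cartesian(A, C)

print(covers, same_size, left == right, len(left))
```

Since the two pieces on the right are disjoint, counting elements gives 3·(2 + 3) = 3·2 + 3·3 for these sample sets.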


5.3 Finite Sets and Dedekind Infinite Sets 


The concepts of finitude and infinity have been used in mathematics since 
antiquity (recall Euclid’s proof that there are an infinity of primes), but precise
definitions were given only in relatively modern times. 


First Definition: Finite Sets as Inductive Sets 


Our first definition of finite sets closely matches the intuition of “being similar to 
the set {1, 2, ..., n} for some natural number n,” but does not presuppose the notion
of a natural number. The key idea here is the principle of induction: The definition 
will roughly be that the finite sets are precisely those sets which satisfy induction, 
in a sense to be seen below. 
Definition 246 (Inductive Families). Let A be a set. We say that a collection C of
subsets of A is an inductive family over A if it satisfies


1. ∅ ∈ C;
2. E ∈ C and a ∈ A ⇒ E ∪ {a} ∈ C.


Informal discussion. Here are some informal examples of inductive families: 


1. For any set A, the power set of A, P(A) is an inductive family over A. 

2. The family of all bounded subsets of R is inductive over R (a subset E of R is
called bounded if, for some real number a, we have −a < x < a for all x ∈ E). This
inductive family does not include R itself as a member.

3. The family C of subsets of N not containing any arithmetic progression is
inductive over N (E contains an arithmetic progression if there are a, b ∈ N
with a + bn ∈ E for all n ∈ N). Here N ∉ C but many infinite sets, e.g., the set
{1, 4, 9, ...} of perfect squares, are members of C.


Informally using the word “finite,” note the following “test for finitude”:


1. If A is finite, then any inductive family C over A contains A as a member. Reason:
∅ ∈ C by the first clause of the definition, so by the second clause C contains all
singleton subsets of A, then all the doubletons, and so on, picking up every finite
subset of A, and so A itself, in the process.


2. If A is not finite, then there is an inductive family over A which does not contain
A as a member, namely the family F_A of all finite subsets of A. (F_A is an inductive
family over A since ∅ is finite, and for any finite set E the set E ∪ {x} has at
most one more member and so is finite.)


We now invert this test to get our first formal definition of finite sets. 


Definition 247 (Finite Sets). A set A is finite, or inductive, if A is a member of 
every inductive family over A. A is infinite if A is not finite, i.e., if there is an 
inductive family over A which does not contain A as a member. 
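For a small set A the "test for finitude" can be carried out mechanically. The sketch below computes the smallest inductive family over A by closing {∅} under E ↦ E ∪ {a}. Since every inductive family over A contains this smallest one, finding A in it shows that A belongs to all of them. The function name smallest_inductive_family is an illustrative assumption, and the loop terminates only because Python sets are finite to begin with.

```python
def smallest_inductive_family(a: frozenset) -> set:
    """Close {∅} under E -> E ∪ {x} for x in A (clauses 1 and 2 of Definition 246)."""
    family = {frozenset()}          # clause 1: the empty set is a member
    frontier = [frozenset()]
    while frontier:
        e = frontier.pop()
        for x in a:
            bigger = e | {x}        # clause 2: adjoin one element of A
            if bigger not in family:
                family.add(bigger)
                frontier.append(bigger)
    return family

A = frozenset({1, 2, 3})
family = smallest_inductive_family(A)

# A itself is reached, and here the closure is the whole power set of A.
print(A in family, len(family))
```

For an infinite A the analogous closure would be the family of all finite subsets of A, which does not contain A, matching the second half of the test above.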


An immediate corollary of the definition is the following principle which is very 
useful for establishing properties of finite sets. 


Theorem 248 (The Principle of Induction over Finite Sets). Let P be a property 
of sets such that (a) the empty set ∅ has property P and (b) if any set E has property
P, then so does the set E U {x} obtained from E by adjoining any single element 
x. Then every finite set has property P. 


Proof. Let A be any finite set. Define C to be the collection of all those subsets of
A which have property P. By the given conditions C is an inductive family over A,
and so A ∈ C since A is finite. Hence A has property P. □


Another useful fact is the following. 


Theorem 249. The empty set is finite. If A is finite then so is A U {b} for any b. In 
particular, every singleton is finite. 

Proof. The empty set is a member of every inductive family, so is finite.

Suppose A is finite. Let C be an inductive family over A ∪ {b}. We show that
A ∪ {b} ∈ C.

Put C_A := {E ∈ C | E ⊆ A}. Then C_A is an inductive family over A, and
so A ∈ C_A since A is finite. Hence A ∈ C. But C is inductive over A ∪ {b} and
b ∈ A ∪ {b}, so A ∪ {b} ∈ C. □


The following results about finite sets are proved by induction over finite sets 
(Theorem 248) and using Theorem 249. Let us give a typical example. 


Problem 250. Any subset of a finite set is finite. 


Proof. We prove the result by induction on finite sets (Theorem 248). 

First, any subset of ∅, being equal to ∅ itself, is finite.

Next, assume that every subset of E is finite (induction hypothesis). Then given
any x, a subset S of E ∪ {x} is either a subset of E and so is finite by induction
hypothesis, or S has the form S = T ∪ {x} for some T ⊆ E and so is finite by
Theorem 249 since T is finite by induction hypothesis. □


Problem 251. The image of a finite set under any function is finite. If A is finite and
B ∼ A then B is finite. If A is finite and B ≼ A then B is finite.


Problem 252. A set A is infinite if and only if B ≺ A for every finite set B.


Problem 253. If A and B are finite then so is A ∪ B.


[Hint: Use induction on B. The induction step consists of showing that if A ∪ B is
finite (induction hypothesis) then so is A ∪ (B ∪ {x}) for any x.]


Problem 254. If A is finite then so is its power set P(A). If A and B are finite then
so is the Cartesian product A × B.


Problem 255. A finite union of finite sets is finite. That is, if C is finite and every
member of C is a finite set, then ⋃C is finite.


[Hint: In the induction step, use the fact that ⋃(C ∪ {E}) = (⋃C) ∪ E.]


Problem 256 (Transitive Closure, Frege–Russell). Let R be a relation. We say
that y is an R-successor of x if xRy. A set A is called R-hereditary if for all x, y,
x ∈ A and xRy ⇒ y ∈ A. Define a relation R_* as follows: xR_*y if and only if y
is a member of every R-hereditary set containing all R-successors of x. Then R_* is
the least transitive relation containing R, i.e.,


1. R ⊆ R_*, i.e., xRy ⇒ xR_*y.
2. R_* is a transitive relation.
3. If T is any transitive relation with R ⊆ T, then R_* ⊆ T.
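

For a finite relation the definition in Problem 256 can be evaluated directly. The sketch below is only an illustration under that finiteness assumption; the helper names `r_star` and `is_transitive` are ours. It uses the fact that y lies in every R-hereditary set containing the R-successors of x exactly when y lies in the smallest such set, which is computed by closing the set of successors under R.

```python
def r_star(R, universe):
    """Return the relation of Problem 256 as a set of pairs, for a finite relation R."""
    R = set(R)
    result = set()
    for x in universe:
        # Start from the R-successors of x and close under R: this is the
        # smallest R-hereditary set containing all R-successors of x.
        closure = {y for (u, y) in R if u == x}
        changed = True
        while changed:
            changed = False
            for (u, v) in R:
                if u in closure and v not in closure:
                    closure.add(v)
                    changed = True
        result |= {(x, y) for y in closure}
    return result


def is_transitive(T):
    """Check that (x, y) in T and (y, w) in T imply (x, w) in T."""
    return all((x, w) in T for (x, y) in T for (z, w) in T if y == z)


R = {(1, 2), (2, 3), (3, 4)}
R_star = r_star(R, universe={1, 2, 3, 4})
print(sorted(R_star))
print(R <= R_star, is_transitive(R_star))
```

The two printed checks correspond to clauses 1 and 2 of the problem; clause 3 (minimality) is not tested by this sketch.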


Galileo found that, paradoxically, a “small” part of a collection may in fact be of
the same size as the whole collection. He took the natural numbers as the whole 
collection and then formed a strictly “smaller” part of the whole by taking only 
the perfect squares, which are rather sparsely distributed in the natural numbers 
since they become rarer among larger numbers. But, strangely enough, a one-to- 
one correspondence between the whole and the strictly smaller part is established 
by n ↔ n², showing that the size of the part is equal to the size of the whole, not
smaller! 

Dedekind turned Galileo’s paradox into a precise definition of infinity: The 
Dedekind infinite sets are precisely the ones showing this “paradoxical” behavior. 
This is our second definition of finite and infinite sets. 


Definition 257 (Dedekind). A set A is said to be Dedekind infinite or reflexive if
A ∼ B for some proper subset B ⊊ A, i.e., if there is a function f: A → A which
is one-to-one but not onto. A set will be called Dedekind finite or non-reflexive if it
is not Dedekind infinite.

A reflection of a set A is a one-to-one map of A into a proper subset of A. 


In our first version (Definition 247), we gave a direct natural definition of finite 
sets, and infinite sets were then defined indirectly—as sets which are not finite. In 
Dedekind’s definition, the opposite is done: A simple direct definition of infinite 
sets is given, and finite sets are defined indirectly—as sets which are not Dedekind 
infinite. 


Problem 258. If A ∼ B then A is Dedekind finite if and only if B is.


Problem 259. Let A ⊆ B. If A is Dedekind infinite then so is B. Equivalently, if B
is Dedekind finite then so is A.


[Hint: A reflection f: A → A can be extended to f*: B → B by setting f*(x) = x
for all x ∈ B∖A.]


Corollary 260. Let A ≼ B. If A is Dedekind infinite then so is B.


Proposition 261. Suppose that x ∉ A. Then A is Dedekind infinite if and only if
A ∪ {x} ≼ A.


Proof. Suppose first that A is Dedekind infinite. Let f: A → A be one-to-one but
not onto, fix y ∈ A∖f[A], and extend f to f*: A ∪ {x} → A by setting f*(x) = y.
Then f* is injective, so A ∪ {x} ≼ A.

For the converse, assume that A ∪ {x} ≼ A, and let f: A ∪ {x} → A be an
injection. Then f(x) ∈ A∖f[A] since f is injective. Hence f↾A: A → A is a
reflection, and so A is Dedekind infinite. □


Corollary 262. If A is Dedekind finite and A ⊊ B then A ≺ B.

[Hint: B ≼ A would imply A ∪ {b} ≼ A for some b ∈ B∖A.]


Proposition 263. If A is Dedekind finite then so is A ∪ {b}.


Proof. Let B := A ∪ {b} be Dedekind infinite. We show that then so is A.

This follows from Problem 233, but let us give a direct proof.

Let f: B → B be a reflection and fix c ∈ B∖f[B]. If b ∉ ran(f) then f[B] ⊆ A
and we are done by Proposition 261, so let us fix a ∈ B with f(a) = b. Now
modify f to a function g: B → B by redefining the value of f at a to be c, i.e., let
g(a) = c and g(x) = f(x) for all x ≠ a. Then g is one-to-one with g[B] ⊆ A, so
A is Dedekind infinite by Proposition 261. □


This gives the following basic result by induction over finite sets. 


Theorem 264. Any finite set is Dedekind finite. 


Finite Cardinals 


Definition 265 (Finite Cardinals). A cardinal μ is called a finite cardinal if μ =
|A| for some finite set A; otherwise, μ is called an infinite cardinal. The set of all
finite cardinals will be denoted by J, and for each cardinal κ, J_κ will denote the set
of all finite cardinals less than κ:


J := {|A| : A is finite}, and J_κ := {μ ∈ J | μ < κ}.


Problem 266. 0 ∈ J and if μ ∈ J then μ + 1 ∈ J. So 1 ∈ J, 1 + 1 ∈ J, etc.
Moreover if μ, ν ∈ J then μ + ν ∈ J and μν ∈ J.


Theorem 267 (Principle of Induction for J). Suppose that K is a set of cardinal
numbers such that 0 ∈ K and μ ∈ K ⇒ μ + 1 ∈ K. Then J ⊆ K.


[Hint: To get J ⊆ K, show that |A| ∈ K for every finite set A by induction over
finite sets. Use the fact that if x ∉ A then |A ∪ {x}| = |A| + 1.]


We have the following series of corollaries to Corollary 262:


Corollary 268. If ν, κ are cardinals with ν finite, then ν < κ ⇔ κ = ν + μ for
some nonzero cardinal μ. In particular, ν < ν + 1 for all ν ∈ J.


Corollary 269. If ν, κ are cardinals with ν finite, then ν < κ ⇔ κ = ν + 1 or
κ > ν + 1.


Corollary 270 (Strong Trichotomy for Finite Cardinals). If ν, κ are cardinals
with ν finite, then exactly one of ν < κ, ν = κ, or ν > κ holds.


[Hint: Use induction on ν.]


Corollary 271. If ν, κ are cardinals with ν finite, then κ < ν + 1 if and only if
κ < ν or κ = ν.


Corollary 272. If ν is a finite cardinal, then the set of all cardinals smaller than
ν is a finite set of cardinality ν. That is, |J_ν| = ν for all ν ∈ J.


5.4 Natural Numbers and Reflexive Cardinals 


In Part I, we defined real numbers in terms of ratios, and ratios in terms of 
natural numbers, but the natural numbers themselves were left undefined—we only 
assumed they are primitive entities satisfying the Dedekind–Peano axioms. We now
officially define the natural numbers as the nonzero finite cardinals, and, from this
definition, derive the Dedekind–Peano axioms, i.e., prove them as theorems. This
gives an interpretation for the natural numbers, or a model for the Dedekind–Peano
axioms, in terms of our current primitive of cardinal numbers. 


Definition 273 (The Natural Numbers N). A natural number is a nonzero finite
cardinal. The set of all natural numbers is denoted by N:


N := J∖{0} = {|A| : A is finite, A ≠ ∅}.


We also define the successor map S by S(κ) := κ + 1 (for any cardinal κ).


Theorem 274. The successor function S restricted to the set J of finite cardinals
maps J bijectively onto the set N of natural numbers. 


[Hint: To show that S is injective, use Problem 241.] 


Corollary 275. We have N ∼ J with N ⊊ J and J∖N = {0}. Hence J and N are
Dedekind infinite, and so infinite. So A ≺ N for every finite A.


Proving the Dedekind–Peano Axioms


Problem 276. 1 ∈ N, and if κ ∈ N then S(κ) ∈ N. Moreover if α, β ∈ N then
α + β ∈ N and αβ ∈ N.


It is now routine to verify that the above interpretation of the natural numbers
satisfies the Dedekind–Peano Axioms:


Theorem 277 ((N, S, 1) Models the Dedekind–Peano Axioms). With the natural
numbers N, the successor map S, and the cardinal number 1 as defined above, all
five Dedekind–Peano axioms are satisfied.


Proof. Since 1 ∈ N, the first axiom holds. The second axiom holds because if κ
is a nonzero finite cardinal then so is S(κ) = κ + 1. To verify the third axiom,
suppose, if possible, that 1 = S(κ) for some κ ∈ N. Then S(0) = S(κ), so 0 = κ
(since S is injective), which is impossible since 0 ∉ N. The fourth axiom again
follows from injectivity of S. Finally, the induction axiom is essentially the same as
Theorem 267. □


Problem 278. Addition and multiplication of natural numbers as defined recursively
in the Dedekind–Peano system coincide with the corresponding operations
for cardinal numbers restricted to N. Similarly, the ordering relation defined for the
natural numbers via the Dedekind–Peano system coincides with the cardinal “less
than” relation restricted to N.


[Hint: Let + denote cardinal addition, and +′ denote addition as defined in the
Dedekind–Peano system via the recursion equations m +′ 1 = S(m) and m +′
S(n) = S(m +′ n). By the associative law, cardinal addition satisfies the same
recursion equations, and a routine induction on the second variable shows that +
and +′ coincide. Multiplication is handled similarly.

The ordering relation <′ in the Dedekind–Peano system was defined as m <′
n ⇔ n = m + k for some k ∈ N. But the same criterion has been already established
for finite cardinals, so the two relations coincide.]


Now that we have proved the Dedekind–Peano axioms with the natural numbers
defined as the nonzero finite cardinals, the entire theory developed in Part I becomes 
available to us as a corollary. In particular, the principles of recursive definition, 
as well as the complete ordered field R of real numbers, its subsets Q (rational 
numbers) and Z (integers), and their general properties can be officially used. For 
example, by Theorem 68 we have: 


Corollary 279 (The Well-Ordering Property). Every nonempty subset of N con- 
tains a smallest element. 


Dedekind Infinite Sets, Reflexive Cardinals, and ℵ₀


Theorem 280. A set A is Dedekind infinite if and only if N ≼ A.


Proof. If N ≼ A then A is Dedekind infinite since N is.

For the other direction, suppose there is a one-to-one reflection h: A → A, and
fix a ∈ A∖h[A]. The main idea behind the proof is this: Since h is a reflection, we
have the strictly decreasing sequence of sets


A ⊋ h[A] ⊋ h[h[A]] ⊋ h[h[h[A]]] ⊋ ···


Since h is injective, from a ∈ A∖h[A] we get h(a) ∈ h[A]∖h[h[A]], and so
h(h(a)) ∈ h[h[A]]∖h[h[h[A]]], etc. This makes the elements a, h(a), h(h(a)), ...
all distinct, as shown in the figure below.

We thus get an injective mapping f: N → A if we set f(1) = a, f(2) = h(a),
f(3) = h(h(a)), etc. We now formalize this idea into a rigorous proof.

By the basic principle of recursive definition, there is f: N → A such that
f(1) = a and f(n + 1) = h(f(n)) for all n ∈ N. We claim that f is injective,
i.e., f(m) ≠ f(n) for m ≠ n. By trichotomy, it suffices to show that for any n,
f(m) ≠ f(n) for all m < n. We prove this by induction on n.

For n = 1 this is vacuously true. Assume that f(m) ≠ f(n) for all m < n
(induction hypothesis). We show that f(m) ≠ f(n + 1) for all m < n + 1. Let
m < n + 1. If m = 1, then f(m) = f(1) = a ∉ ran(h) while f(n + 1) =
h(f(n)) ∈ ran(h), so f(m) ≠ f(n + 1). If m > 1, then m = k + 1 for some
k ∈ N. Since m < n + 1, we get k + 1 < n + 1, and so k < n. By induction
hypothesis, f(k) ≠ f(n), so h(f(k)) ≠ h(f(n)) (since h is one-to-one), i.e.,
f(k + 1) ≠ f(n + 1), or f(m) ≠ f(n + 1) as desired. □
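

The recursion f(1) = a, f(n + 1) = h(f(n)) in this proof can be tried out on a concrete reflection. The sketch below is only an illustration. The choice A = {1, 2, 3, ...}, h(x) = 2x and a = 1 is a hypothetical example of ours (h is one-to-one and 1 is not in its range), and the helper name `orbit` is not from the text.

```python
from itertools import islice


def orbit(h, a):
    """Yield f(1) = a, f(2) = h(a), f(3) = h(h(a)), ... following the recursion."""
    x = a
    while True:
        yield x
        x = h(x)


def double(x):
    """A hypothetical reflection of the positive integers: one-to-one, and 1 is not a value."""
    return 2 * x


first_terms = list(islice(orbit(double, 1), 10))
print(first_terms)
print(len(set(first_terms)) == len(first_terms))   # the listed terms are pairwise distinct
```

A finite run like this can only sample the sequence; that all terms are distinct is what the induction in the proof establishes.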


Definition 281 (Reflexive Cardinals). A cardinal κ is called a reflexive cardinal if
κ = |A| for some Dedekind infinite set A.


Corollary 282. Every reflexive cardinal is an infinite cardinal. 


We now define ℵ₀ to be the cardinal number of the set N of natural numbers. (ℵ₀ is
called aleph-nought, or aleph-null, or aleph-zero.)


Definition 283. ℵ₀ := |N|.


By Corollary 275 we have:


Corollary 284. ℵ₀ = |J|, so ℵ₀ + 1 = ℵ₀. Also ℵ₀ is a reflexive cardinal and
therefore an infinite cardinal. Thus n < ℵ₀ for all finite cardinals n ∈ J.


We have the following characterizations of a reflexive cardinal.


Proposition 285. For any cardinal κ the following conditions are equivalent:


1. κ is reflexive.
2. ℵ₀ ≤ κ.
3. κ + 1 = κ.


[Hint: 1 ⇒ 2 by Theorem 280. For 2 ⇒ 3, note that if ℵ₀ ≤ κ then κ = α + ℵ₀ for
some cardinal α, so κ + 1 = (α + ℵ₀) + 1 = α + (ℵ₀ + 1) = α + ℵ₀ = κ. Finally,
3 ⇒ 1 follows from definition (or use Proposition 261).]


Corollary 286. ℵ₀ is the smallest reflexive cardinal.


Can an Infinite Set be Dedekind Finite?


Since no Dedekind infinite set is finite, we have the following picture. 


[Figure: diagram with a region labeled “Dedekind Infinite Sets (Reflexive)” and a region marked “?”.]


So the question is: Is every infinite set Dedekind infinite? That would imply that 
the region marked by “?” in the diagram above is empty and so the two notions of 
finitude would coincide: A set will be Dedekind finite if and only if it is finite. Or: 
Are there sets which are infinite but Dedekind finite? 

If there were such a set A, then {1, 2, ..., n} ≼ A for all n, i.e., A has finite
subsets {a_1, a_2, ..., a_n} with n distinct elements for every n, yet N ⋠ A, i.e., there is
no infinite sequence ⟨a_1, a_2, ..., a_n, ...⟩ of distinct elements from A. As we do not
have a clear intuition about such sets, we can perhaps show that this is impossible,
resulting in the “clean solution” that the two notions of finitude coincide.

Let us recall the proof of this when A was Dedekind infinite (Theorem 280): The
presence of a reflection h: A → A and an element a ∈ A∖h[A] allowed us to define
a sequence of distinct elements as a_1 = a, a_2 = h(a_1), a_3 = h(a_2), etc. Notice
that this infinite sequence is specified uniquely in terms of the reflection h and the
element a.

When A is infinite but not known to be Dedekind infinite, no such reflection is
available but we can try to argue as follows. Fix a_1 ∈ A, then pick a_2 ∈ A∖{a_1},
a_3 ∈ A∖{a_1, a_2}, etc., and in general choose a_{n+1} ∈ A∖{a_1, a_2, ..., a_n}. Since A is
infinite, a finite set {a_1, a_2, ..., a_n} cannot exhaust A, so it will always be possible
to choose a_{n+1}, and the induction seems to go through.

However, the problem here is that, unlike the Dedekind infinite case where a
reflection was available, there is no mechanism to specify a_{n+1} uniquely in terms
of a_1, a_2, ..., a_n. Thus the argument requires infinitely many arbitrary choices, a
process that can only be formalized using the Axiom of Choice.


5.5 The Axiom of Choice vs Effectiveness 


Recall from Sect. 1.6 that a partition is a family of pairwise-disjoint nonempty sets, 
and that we say P is a partition of A if P is a partition whose union equals A. 


Definition 287 (Choice Set). A choice set for a partition P is a set containing
exactly one element from each set of the partition P. More precisely, C is a choice
set for P if C ⊆ ⋃P and C ∩ E is a singleton for every E ∈ P.


We now state the Axiom of Choice, henceforth referred to as “AC.” 
AC (The Axiom of Choice). Every partition has a choice set. 


Whether AC is a “self-evident mathematical principle” or not was initially a matter 
of controversy, although many mathematicians do find it acceptable. However, the 
introduction of AC as a separate explicit axiom (by Zermelo [84, 86]) eventually 
helped to mitigate the debate, since now one could sharply distinguish between 
mathematical results which use the AC and the ones which do not, and so 
mathematicians could individually make (or postpone) the choice to accept or reject 
the Axiom of Choice. 

In the case where the partition P is finite, the validity of the principle AC can 
formally be derived by induction (on the size of P), but since nobody objects to 
making finitely many choices from finitely many sets, it is common to encounter 
informal proof-fragments such as 


... Since the sets A_1, A_2, ..., A_n are nonempty, let us choose and fix elements a_1 ∈
A_1, a_2 ∈ A_2, ..., a_n ∈ A_n.


The use of AC therefore is necessary only when the partition P is infinite. For many 
standard results of mathematics, the full general form needs AC, while special 
“finite” cases can be proved without AC. For example the proof that an arbitrary 
vector space has a basis requires AC, but one can prove that every finite dimensional 
vector space has a basis without using AC. 


Effectiveness and Effectively Defined Choice Sets 


In some cases, even when the partition P is infinite one can (by exploiting some 
additional structure and properties of the underlying set or partition) explicitly state 
a rule which determines a unique member of E for each E in P. This is expressed
by saying that a member of E can be “effectively and uniquely determined” from 
E. In this case, a choice set C can be effectively specified or effectively defined from 
P, and the use of AC is not needed. Such a choice set C will be called an effective 
choice set. 


Example 288. If P is any partition of the set N of natural numbers (which is
naturally well-ordered), then one can effectively define a choice set C by choosing
the least element of each set of the partition P, i.e., by setting C = {min(E) | E ∈
P}. Notice how C is effectively specified from P.


¹Note that if P is finite, the sets in P may be infinite. We should be careful to distinguish between
“a partition being finite” and “the sets in the partition being finite.” For example, the partition of
the set of integers into even and odd integers is a finite partition consisting of infinite sets, while
the partition {{2n, 2n + 1} | n ∈ Z} of the integers is an infinite partition consisting of finite sets.


Thus every partition of the set of natural numbers has an effective choice set. 
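
The rule “take the least element of each block” can be written out directly. The following sketch works with a finite sample only, which is an assumption of the illustration (a program cannot list an infinite block). The partition of {1, ..., 12} by remainder modulo 3 and the function name `least_element_choice_set` are hypothetical choices of ours.

```python
def least_element_choice_set(partition):
    """Pick min(E) from each block E, mirroring the rule C = {min(E) | E in P}."""
    return {min(E) for E in partition}


# Hypothetical finite sample: {1, ..., 12} partitioned by remainder modulo 3.
partition = [frozenset(n for n in range(1, 13) if n % 3 == r) for r in range(3)]
C = least_element_choice_set(partition)
print(sorted(C))
# C meets every block in exactly one element, as a choice set must.
print(all(len(C & E) == 1 for E in partition))
```

No arbitrary choices are made here: the well-ordering of N supplies the rule, which is the point of the example.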

The concept of effectiveness (or effective specification) will be encountered 
occasionally throughout this book. We will use it as an informal intuitive notion, 
without attempting to give a formal definition.² Kuratowski [45, p. 254] explains
that effectiveness concerns ways of proving existence theorems, i.e., theorems of 
the form “there exists x having property P.” Such a theorem is said to be proved 
effectively if one can explicitly define a specific object a and prove that a has 
property P. 

We have already used effective choice sets in proving that N ≼ A if one is
given a one-to-one reflection h: A → A and a ∈ A∖h[A]. The sets A ⊋ h[A] ⊋
h[h[A]] ⊋ ··· then keep decreasing, producing pairwise disjoint sets A_1 = A∖h[A],
A_2 = h[A]∖h[h[A]], etc. Since a ∈ A_1, h(a) ∈ A_2, h(h(a)) ∈ A_3, etc., the elements
a, h(a), h(h(a)), ... form an effective choice set for the family {A_n | n ∈ N},
thereby effectively proving N ≼ A.


Problem 289. Let R/Z denote the partition of R consisting of all sets of the form
a + Z with a any real, where a + Z := {a + x | x ∈ Z}. In other words, R/Z is the
partition given by the equivalence relation ∼_Z on R, where we define x ∼_Z y ⇔
x − y ∈ Z, for x, y ∈ R.³ Find an effective choice set for this partition R/Z of R.


Problem 290. Find effective choice sets for the partitions of the equivalence 
relations in Problems 46 and 49 of Chap. 1. 


Among the equivalence relations studied by the ancient Greek geometers (e.g., 
congruence and similarity mappings) was commensurability of length. Say that the 
positive reals x, y ∈ R⁺ are commensurable if x/y is rational. We can then ask:
Can we define a choice set for the partition determined by the commensurability 
relation? 

Commensurability also has an essentially equivalent “additive” version where 
two reals in R are defined to be equivalent if they differ by a rational number. The 
question of defining a choice set for commensurability then becomes equivalent to 
the following problem. 


Problem 291. Let R/Q denote the partition of R consisting of all sets of the form
a + Q with a any real, where a + Q := {a + x | x ∈ Q}. In other words, R/Q is the
partition given by the equivalence relation ∼_Q on R, where we define x ∼_Q y ⇔
x − y ∈ Q, for x, y ∈ R.⁴ Can you define a choice set for this partition R/Q of R?


²Effectiveness is a metamathematical notion, and degrees of effectiveness (which depend on the
complexity of the specification or rule) are studied in areas of mathematical logic such as recursion
theory.

³This is the coset decomposition of the additive group R modulo the subgroup Z.

⁴This is the coset decomposition of the additive group R modulo the subgroup Q, and is sometimes
called the Vitali partition.


It is instructive for the reader to try to define a choice set for R/Q, but there is no
reason to get discouraged if such a choice set seems too elusive to define. From the 
work of Feferman, it is now known that the existence of a choice set for the partition 
R/Q cannot be proved without appealing to the Axiom of Choice, and even if the 
use of AC is allowed, no effectively defined set can be proved (without additional 
axioms) to be a choice set for R/Q. 

Thus for some partitions effective choice sets can be defined without using AC, 
but there are partitions which have no effective choice sets, making the use of AC 
essential to obtain choice sets in such partitions. 

This is illustrated by Russell’s example of the millionaire who bought ℵ₀ pairs
of socks and ℵ₀ pairs of boots. The question is: Can we make a selection of socks
with exactly one sock from each pair, and similarly for boots? Russell, who called 
the Axiom of Choice the multiplicative axiom, wrote: 


[I]t can be done with the boots, but not with the socks ... The reason for the difference is 
this: Among boots we can distinguish right and left, and therefore we can make a selection 
of one out of each pair, namely, we can choose all the right boots or all the left boots; but 
with socks no such principle of selection suggests itself, and we cannot be sure, unless we 
assume the multiplicative axiom, that there is any class consisting of one sock out of each 
pair. ...[W]ith the socks we shall have to choose arbitrarily, with each pair, which to put 
first; and an infinite number of arbitrary choices is an impossibility. Unless we can find a 
rule for selecting, i.e., a relation which is a selector, we do not know that a selection is even 
theoretically possible. [68, p. 126] 


Thus AC may be needed even when every set in the partition is finite (or even of 
cardinality 2), unless some additional structure can be exploited. 


Problem 292. Let C be a collection of pairwise disjoint nonempty finite sets of 
complex numbers. Show that C has an effective choice set. 


Problem 293. For an equivalence relation ∼ on a set A, a function F: A → A is
called a selector for ∼ if F is a complete invariant for ∼ (i.e., x ∼ y ⇔ F(x) =
F(y)) and F(x) ∼ x for all x ∈ A. Prove that


1. AC holds if and only if every equivalence relation has a selector. 
2. An equivalence relation has an effective selector if and only if the corresponding 
partition has an effective choice set. 


Problem 294. Show that AC is equivalent to the following statement: If F: X → Y
is surjective then there is G: Y → X such that the function F ∘ G: Y → Y is the
identity mapping on Y.


If R is a relation we say that F is a uniformization of R if F ⊆ R, F is a function,
and dom(F) = dom(R).


Problem 295. Show that AC is equivalent to the statement every relation has a 
uniformization. 


AC via Choice Functions


The form of AC we have been using so far is called the partition version of AC, 
where the sets from which elements are chosen are required to be pairwise disjoint. 
It is generally more useful to use a formulation of AC where the sets from which 
elements are to be chosen are not required to be disjoint. 


Definition 296 (Choice Functions). If C is any family of nonempty sets (i.e., if
E ∈ C ⇒ E ≠ ∅), then a choice function for C is any function F: C → ⋃C such
that F(E) ∈ E for all E ∈ C.


AC1 (Choice Function Version of AC). Every family of nonempty sets has a
choice function.


Problem 297. The principle AC1 is equivalent to the principle AC.


[Hint: If Y is any collection of nonempty sets, then {{E} × E | E ∈ Y} is a closely
related collection of pairwise disjoint sets.]


A special case of AC1 is obtained by taking C = P*(A), where A is any set and
P*(A) := P(A)∖{∅} denotes the collection of all nonempty subsets of A. In this
case a choice function F: P*(A) → A is also called a choice function for the set
A. Note that this special case is actually equivalent to AC1, as we can restrict any
choice function for the set A := ⋃C to the subfamily C. Thus AC1 is sometimes
expressed by saying every set has a choice function.

AC1 can also be restated in terms of indexed families of sets as follows.


AC1 (Indexed Family Version). If ⟨A_i | i ∈ I⟩ is an indexed family of sets with
A_i ≠ ∅ for all i ∈ I, then there is a “choice function” φ: I → ⋃_i A_i such that
φ(i) ∈ A_i for all i ∈ I.


Problem 298. The indexed family version above is also equivalent to AC.


[Hint: If ⟨A_i | i ∈ I⟩ is an indexed family of nonempty sets then {{i} × A_i | i ∈ I} is
a partition, and any choice set for this partition is a function which satisfies the needed
condition.]
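

The tagging trick in the hint can be seen on a small example. The sketch below is only an illustration. The overlapping family and its index names are hypothetical, and because the sample is finite a choice set can be written down by a rule (least pair in each block) rather than by appeal to AC.

```python
def choice_function_from_choice_set(choice_set):
    """A choice set for the blocks {i} x A_i is a set of pairs (i, a); read it as the map i -> a."""
    return dict(choice_set)


# Hypothetical overlapping family indexed by I = {"x", "y", "z"}.
family = {"x": {1, 2}, "y": {2, 3}, "z": {1, 3}}

# Disjointify: the blocks {i} x A_i are pairwise disjoint even though the A_i overlap.
blocks = {i: {(i, a) for a in A_i} for i, A_i in family.items()}

# Any choice set for the blocks will do; for this finite sample we take the least pair.
choice_set = {min(block) for block in blocks.values()}
phi = choice_function_from_choice_set(choice_set)
print(phi)
print(all(phi[i] in family[i] for i in family))
```

Because each block carries its index as a tag, a set that meets every block exactly once is automatically single-valued on the index set, which is why it is a function.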


5.6 ℵ₀ and Countable Sets


Theorem 299. If A ⊆ N and A is infinite then A ∼ N.


Proof. Using the well-ordering property of N, define f: N → N by recursion as
follows: Let f(1) be the least element of A and f(n + 1) be the least element
of A∖{f(1), f(2), ..., f(n)}. The recursion proceeds without halting since A is
infinite. It is then easily verified that f is a bijection from N onto A.

For a more formal proof, let a be the least element of A, let h: N → N be the
function defined by h(n) := the least element of A greater than n, and apply the
basic principle of recursive definition (Theorem 146). □
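

The recursion in this proof amounts to listing A in increasing order. The sketch below is a hedged illustration. It assumes A is given by a membership test `in_A` (a name of ours), and it uses the set of perfect squares as a hypothetical example of an infinite subset of N.

```python
from itertools import count, islice
from math import isqrt


def enumerate_subset(in_A):
    """Yield f(1), f(2), ... where f(n + 1) is the least element of A above f(n).

    `in_A` is a membership test for A ⊆ N; the search for the next element
    is guaranteed to succeed only because A is assumed to be infinite.
    """
    for n in count(1):
        if in_A(n):
            yield n


def is_square(n):
    """Membership test for the set of perfect squares."""
    return isqrt(n) ** 2 == n


print(list(islice(enumerate_subset(is_square), 6)))
```

If A were finite, the generator would search forever after exhausting A, which matches the remark that the recursion proceeds without halting precisely because A is infinite.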


Corollary 300. For a cardinal κ, κ ≤ ℵ₀ if and only if κ is finite or κ = ℵ₀.


Definition 301 (Denumerable and Countable Sets). A is denumerable if A ∼ N,
i.e., if |A| = ℵ₀. A set is countable if it is denumerable or finite.


Corollary 302. A is countable if and only if A ≼ N if and only if |A| ≤ ℵ₀. Thus a
subset of a countable set is countable.


Corollary 303. Any infinite subset of a denumerable set must have cardinality ℵ₀
and so must itself be denumerable.


Effectiveness for equinumerosity and cardinal equality. The notion of effectiveness
plays an important role in similarity (equinumerosity) of sets and equalities
between cardinals. We say that a set A is effectively equinumerous or effectively
similar to a set B if one can effectively define a bijection between A and B. Two
cardinals κ and μ are said to be effectively equal, expressed by saying that “κ = μ
effectively,” if some set of cardinality κ is effectively equinumerous to some set of
cardinality μ. In particular, a set A is said to be effectively denumerable if there is
an effectively defined bijection between A and N.

The map n ↦ n + 1 establishes a bijection between J and N so J ∼ N effectively.
But J is the disjoint union of N and the singleton {0}, so ℵ₀ + 1 = ℵ₀ effectively.
It follows by induction that:


Problem 304. ℵ₀ + n = ℵ₀ effectively, for all n ∈ J.


Problem 305. Prove that ℵ₀ + ℵ₀ = ℵ₀ effectively, and so 2ℵ₀ = ℵ₀ effectively.
Prove that nℵ₀ = ℵ₀ for all n ∈ N.


Problem 306. Prove that the range of any function with countable domain is 
countable. Prove that a nonempty set is countable if and only if it is the range of 
a function with domain N. 


Terminology overview. Recall that an infinite sequence is a function with domain 
N. The terms of an infinite sequence are the elements of its range. The terms are 
said to be arranged without repetition if the sequence is one-to-one; otherwise we 
say that the sequence has repeated terms. 


Definition 301 and Problem 306 can then be restated as follows.

A set is denumerable if and only if its elements can be arranged in an infinite
sequence without repetition. A nonempty set is countable if and only if its
members can be arranged in an infinite sequence, with repetitions allowed.


Definition 307. An enumeration of a nonempty countable set A is a sequence 
whose range equals A. 


If a: N → A is an enumeration of A (i.e., ran(a) = A), we informally express the
fact by putting a_n := a(n) and writing


A = {a_1, a_2, ..., a_n, ...},


or by saying that “A is enumerated as a_1, a_2, ..., a_n, ....”

Examples of denumerable sets are readily obtained. Any infinite subset of N is 
denumerable. Moreover the set Z of all integers—positive, negative, or zero—is also 
effectively denumerable (why?). 
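
One possible answer to the “why?” is sketched below. This is an illustration of ours, not necessarily the enumeration the text has in mind. It lists the integers as 0, 1, −1, 2, −2, ... by an explicit formula, so that no arbitrary choices are involved; the function name `integer_at` is hypothetical.

```python
def integer_at(n):
    """n-th term (n = 1, 2, 3, ...) of the sequence 0, 1, -1, 2, -2, ..."""
    return n // 2 if n % 2 == 0 else -(n // 2)


terms = [integer_at(n) for n in range(1, 12)]
print(terms)
print(len(set(terms)) == len(terms))   # no integer is repeated among the listed terms
```

Even positions carry the positive integers and odd positions carry 0 and the negative integers, so every integer appears exactly once.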

A more interesting example of a denumerable set is the following. 


Theorem 308 (Cantor). The set of ratios is effectively denumerable. 


Proof. Let the rank of a ratio be defined as the sum of the numerator and denominator
when it is expressed in lowest terms (reduced form). The smallest possible rank is
2, and it is easy to show that there are at most n − 1 ratios of rank n. Now arrange
the ratios not by their order of magnitude, but so that the ratios with a smaller rank
come before the ones with a larger rank, and if two ratios have the same rank, then
put the one with a smaller numerator before the one with a larger numerator. This
arranges the ratios in the sequence


rank 2: 1/1
rank 3: 1/2, 2/1
rank 4: 1/3, 3/1
rank 5: 1/4, 2/3, 3/2, 4/1
rank 6: 1/5, 5/1
rank 7: 1/6, 2/5, 3/4, 4/3, 5/2, 6/1
...


where each group is labeled with its rank. □
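

The arrangement by rank translates directly into a generator. The following is a minimal sketch, assuming “ratio” means a positive rational p/q. It walks through the ranks 2, 3, 4, ... and, within each rank, through the numerators in increasing order, keeping only fractions already in lowest terms. The function name `ratios_by_rank` is ours.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def ratios_by_rank():
    """Yield the positive ratios p/q in lowest terms, ordered by rank p + q, then by numerator."""
    for rank in count(2):
        for p in range(1, rank):
            q = rank - p
            if gcd(p, q) == 1:      # keep only reduced forms, so no ratio repeats
                yield Fraction(p, q)


print([str(r) for r in islice(ratios_by_rank(), 9)])
```

Each rank contributes only finitely many ratios, so every ratio is reached after finitely many steps; this is what makes the listing an enumeration.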


Remark. The function φ defined as φ(n) := the number of reduced ratios of rank n
is important in number theory and is known as Euler’s φ-function.


Problem 309. Show that the set Q of all rational numbers, positive, negative, or
zero, is effectively denumerable. In particular, |Q| = ℵ₀.


Effective Enumeration of N x N 


One can naturally arrange the members of N × N, the set of all pairs of natural
numbers, into the following “infinite matrix,” where the ordered pair (m, n)
occupies the entry in row m column n:


(1,1)  (1,2)  (1,3)  (1,4)  ...  (1,n)  ...
(2,1)  (2,2)  (2,3)  (2,4)  ...  (2,n)  ...
(3,1)  (3,2)  (3,3)  (3,4)  ...  (3,n)  ...
...


We can also arrange the natural numbers N into a similar infinite matrix as shown
below, in which the top row consists of the odd natural numbers, and every other
row is obtained by doubling the previous row:


1    3    5    7    ...   2n − 1      ...
2    6    10   14   ...   2(2n − 1)   ...
4    12   20   28   ...   4(2n − 1)   ...
...


Notice that here the entry in row m column n is 2^{m−1}(2n − 1).

Thus, by letting the ordered pair (m, n) in row m and column n of the first matrix
correspond to the natural number 2^{m−1}(2n − 1) occurring at the same position (row
m and column n) of the second matrix, we get an effective enumeration of N × N.


Problem 310. The mapping $(m,n) \mapsto 2^{m-1}(2n - 1)$ is an effective bijection from
$N \times N$ onto $N$. Thus $N \times N$ is effectively denumerable and $\aleph_0 \aleph_0 = \aleph_0$.
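
The mapping $(m,n) \mapsto 2^{m-1}(2n-1)$ and its inverse can be sketched as follows. This is only an illustration under the assumption that natural numbers start at 1; the function names `pair` and `unpair` are hypothetical.

```python
def pair(m, n):
    """Send (m, n) in N x N to the natural number 2^(m-1) * (2n - 1)."""
    return 2 ** (m - 1) * (2 * n - 1)

def unpair(k):
    """Inverse of pair: write k as a power of 2 times an odd number."""
    m = 1
    while k % 2 == 0:        # each factor of 2 moves one row down the matrix
        k //= 2
        m += 1
    return m, (k + 1) // 2   # the odd part equals 2n - 1, which determines the column n
```

The inverse is well defined because every natural number factors uniquely as a power of 2 times an odd number.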


Problem 311. If $A$ and $B$ are denumerable then so is $A \times B$. If $A$ and $B$ are
effectively denumerable then so is $A \times B$. (And similarly with “denumerable”
replaced by “countable.”)


We write $\aleph_0^2$ as an abbreviation for $\aleph_0 \aleph_0$, so that $\aleph_0^2 = \aleph_0$. We can inductively define
$\aleph_0^n$ for $n \in N$ by letting $\aleph_0^1 := \aleph_0$ and $\aleph_0^{n+1} := \aleph_0^n \aleph_0$.

Problem 312. Show that $\aleph_0^n = \aleph_0$ for all $n \in N$.

We thus get many examples of countable sets. For example, the set of all points
$(x, y)$ in the Cartesian plane with rational coordinates (i.e., with both $x, y \in Q$) is
countable, and similarly for $n$-dimensional space for $n \in N$. The set of all triples of
natural numbers (or rational numbers) is countable.


The next problem gives another effective enumeration of N x N. 


Problem 313. Consider the following arrangement of $N \times N$

\[ (1,1),\ (1,2),\ (2,1),\ (1,3),\ (2,2),\ (3,1),\ (1,4),\ (2,3),\ (3,2),\ (4,1),\ \dots, \]

in which $(m,n)$ precedes $(m',n')$ if either $m+n < m'+n'$ or $m+n = m'+n'$
and $m < m'$. Show that in the above enumeration the pair $(m,n)$ comes at position
$\tfrac{1}{2}(m+n-2)(m+n-1) + m$, and therefore the mapping

\[ (m,n) \mapsto \frac{(m+n-2)(m+n-1) + 2m}{2} \]

gives us another effective bijection from $N \times N$ onto $N$.
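
The diagonal enumeration can be checked numerically. The sketch below assumes the position formula $\tfrac12(m+n-2)(m+n-1)+m$, under which $(1,1)$ comes at position 1; the names `diag_pair` and `diag_unpair` are hypothetical.

```python
from math import isqrt

def diag_pair(m, n):
    """Position of (m, n) in the enumeration by increasing m + n, then increasing m."""
    s = m + n
    return (s - 2) * (s - 1) // 2 + m

def diag_unpair(k):
    """Inverse of diag_pair."""
    # d is the largest integer with d(d+1)/2 < k, so d + 2 is the sum m + n.
    d = (isqrt(8 * k - 7) - 1) // 2
    m = k - d * (d + 1) // 2
    return m, d + 2 - m
```

Here `d * (d + 1) // 2` counts the pairs lying on earlier diagonals, so `m` is the offset within the current diagonal.
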
We do not prefer any specific effective bijection from $N \times N$ onto $N$ over any
other, but instead record their useful properties in the following form, obtained by
considering the inverse of any such effective bijection.


Problem 314 (Effective Pairing Functions). There are effective “pairing functions”
$\pi: N \times N \to N$ (bijective) and $\lambda, \rho: N \to N$ (surjective) such that for all
$m, n, k \in N$:

\[ \lambda(\pi(m,n)) = m, \quad \rho(\pi(m,n)) = n, \quad \text{and} \quad \pi(\lambda(k), \rho(k)) = k. \]

In particular, for all $m, n \in N$ there is $k \in N$ with $m = \lambda(k)$ and $n = \rho(k)$.


Problem 315. Let $\pi$ be the pairing function of Problem 313 and $\lambda, \rho$ be as in
Problem 314. Show that

1. $m < m' \Rightarrow \pi(m,n) < \pi(m',n)$ and $n < n' \Rightarrow \pi(m,n) < \pi(m,n')$.
2. $\lambda(n) \le n$ and $\rho(n) \le n$ for all $n$.
3. For all $k$ there are infinitely many $n$ with $\lambda(n) = k$ (and similarly for $\rho$).


Suppose that $(f_m \mid m \in N)$ is a sequence of sequences, where each $f_m$ is a function
with domain $N$ which enumerates the set $A_m := \operatorname{ran}(f_m)$. Fix effective functions $\pi$,
$\lambda$, and $\rho$ as in Problem 314, and define a sequence $g$ by setting

\[ g(k) := f_{\lambda(k)}(\rho(k)) \qquad (k \in N). \]

Note that $g$ effectively “combines” the sequence of sequences $(f_m \mid m \in N)$ into a
single sequence in the following sense:

1. We have $f_m(n) = g(\pi(m,n))$, and so each $f_m$ can be recovered from $g$ as a
“subsequence of $g$” given by the mapping $n \mapsto g(\pi(m,n))$.

2. We have $\operatorname{ran}(g) = \bigcup_m \operatorname{ran}(f_m)$, and so the set enumerated by $g$ is the union of
the sets enumerated by the functions $f_m$.


We summarize this as follows. 


Proposition 316. A given sequence $(f_m \mid m \in N)$ of enumerations of sets
$(A_m \mid m \in N)$ (with $\operatorname{ran}(f_m) = A_m$) can be effectively combined into a single
enumeration of the union $A := \bigcup_m A_m$ of the sets.
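
The combination $g(k) := f_{\lambda(k)}(\rho(k))$ can be sketched in code. The example below is an illustration only: it assumes the pairing $(m,n) \mapsto 2^{m-1}(2n-1)$ for $\pi$, represents the family as a two-argument function `f(m, n)` standing for $f_m(n)$, and uses a hypothetical family in which $f_m$ enumerates the multiples of $m$.

```python
def pair(m, n):
    """The pairing function pi(m, n) = 2^(m-1) * (2n - 1)."""
    return 2 ** (m - 1) * (2 * n - 1)

def unpair(k):
    """Return (lambda(k), rho(k)), the inverse of pair."""
    m = 1
    while k % 2 == 0:
        k //= 2
        m += 1
    return m, (k + 1) // 2

def combine(f):
    """Given f(m, n) = f_m(n), return the single sequence g(k) = f_{lambda(k)}(rho(k))."""
    def g(k):
        m, n = unpair(k)
        return f(m, n)
    return g

# Hypothetical family: f_m enumerates the multiples of m.
g = combine(lambda m, n: m * n)
```

Each $f_m$ is recovered from `g` as the subsequence `n -> g(pair(m, n))`.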


Countable Union of Countable Sets 


We want to use the last result to establish the useful fact that the union of a countable
family of countable sets is countable. We can attempt to reason as follows. Let
$(A_m \mid m \in N)$ be a sequence of nonempty countable sets, and enumerate each set
$A_m$ as

\[ A_m = \{a_{m,1}, a_{m,2}, a_{m,3}, \dots, a_{m,n}, \dots\} \qquad (m \in N). \]

Then the union $\bigcup_{m \in N} A_m$ can be enumerated as

\[ \bigcup_{m \in N} A_m = \{a_{\lambda(k),\rho(k)} \mid k \in N\}, \]

and so $\bigcup_{m \in N} A_m$ is countable.

But this proof is not effective. It makes a subtle use of the Axiom of Choice,
since (unlike Proposition 316) no sequence of enumerations of the sets $A_m$ is given
beforehand. Each set $A_m$ can be enumerated in many different ways and saying “$A_m$
is enumerated as $A_m = \{a_{m,1}, a_{m,2}, \dots, a_{m,n}, \dots\}$” involves implicitly choosing and
fixing one such enumeration. Since we have an infinite sequence of sets, this results
in an infinite number of choices, requiring AC.⁵

The proof given below (for Proposition 317) makes this use of AC explicit. 
However, since we will be making “at most countably many” choices, the full 
general version of AC will not be used in the proof. Instead, the following special 
case of the Axiom of Choice, known as the Countable Axiom of Choice or CAC, 
will suffice. 


5.7 The Countable and Dependent Axioms of Choice 


CAC (The Countable Axiom of Choice). Every countable family of nonempty
sets has a choice function: If $I$ is countable and $(A_i \mid i \in I)$ is a family of sets with
$A_i \ne \emptyset$ for all $i \in I$ then there is a choice function $\varphi: I \to \bigcup_{i \in I} A_i$ such that
$\varphi(i) \in A_i$ for all $i \in I$.


We then have: 


Proposition 317 (CAC). A countable union of countable sets is countable: If $I$ is
countable and $A_i$ is countable for each $i \in I$, then $\bigcup_{i \in I} A_i$ is countable.

Proof. Without loss of generality, we may assume that $I$ and the sets $A_i$, for all
$i \in I$, are nonempty. Fix effective pairing functions $\lambda$ and $\rho$ as in Problem 314,
so that for all $m, n$ there is $k$ with $m = \lambda(k)$ and $n = \rho(k)$. Since $I$ is nonempty
countable, it can be enumerated by some function $h: N \to I$ with $\operatorname{ran}(h) = I$. For
each $i \in I$, define

\[ E_i := \{f \mid f \text{ is an enumeration of } A_i\} = \{f \mid f: N \to A_i \text{ with } \operatorname{ran}(f) = A_i\}. \]

Each $E_i$ is nonempty since $A_i$ is nonempty and countable. Hence by CAC, there is
a choice function $\varphi$ with $\operatorname{dom}(\varphi) = I$ such that $\varphi(i) \in E_i$ for each $i \in I$. Thus
$\varphi(i)$ is a function enumerating $A_i$, and let us abbreviate $\varphi(i)$ as $\varphi_i$ (for each $i \in I$).
Finally, define $g$ by

\[ g(k) := \varphi_{h(\lambda(k))}(\rho(k)). \]

Then it is routine to verify that $g$ enumerates $\bigcup_{i \in I} A_i$. □


⁵ AC is needed even if the sets $A_m$ are all finite, as illustrated by Russell’s example: Given $\aleph_0$ pairs
of socks and $\aleph_0$ pairs of boots, how many socks do we have in total, and how many boots? With
boots, the answer is $\aleph_0$, but the socks may form an infinite Dedekind finite set and the answer may
be a non-reflexive cardinal.


The reader may once again compare Proposition 316 with Proposition 317 and note 
how the former can be proved effectively while the latter requires the use of CAC. 


Problem 318. Prove that if $\Sigma$ is any nonempty countable alphabet (= set), then
the set $\Sigma^*$ of all words over $\Sigma$ (= finite sequences from $\Sigma$) is countable. Moreover,
if $\Sigma$ is effectively countable, then so is $\Sigma^*$.


Problem 319 (CAC). Let $A_1, A_2, \dots$ be an infinite sequence of pairwise disjoint
sets such that $A_m \sim A_n$ for all $m, n \in N$. Let $\kappa = |A_1|$. If $I \subseteq N$ and $I$ is infinite,
then

\[ \Bigl| \bigcup_{n \in I} A_n \Bigr| = \kappa \cdot \aleph_0 = \Bigl| \bigcup_{n \in N} A_n \Bigr|. \]


The problem below is relevant to the proof of the Cantor–Bernstein theorem.


Problem 320. Suppose that we have a one-to-one reflection $f: C \to C$, and let $A$
be a subset of $C$ disjoint from $f[C]$, that is, with $A \subseteq C \setminus f[C]$. Define the sets
$A_1, A_2, A_3, \dots$ recursively as follows:

\[ A_1 := f[A] \quad \text{and} \quad A_{n+1} := f[A_n]. \]

1. Show that the sets $A, A_1, A_2, \dots$ are pairwise disjoint.
2. Put $A^* := A_1 \cup A_2 \cup \cdots = \bigcup_{n \ge 1} A_n$. Prove effectively that $A \cup A^* \sim A^*$.


Theorem 321 (CAC). Every infinite set is Dedekind infinite.


Proof. Let $A$ be an infinite set. For each $n \in N$, the following set is nonempty since $A$ is infinite:

\[ F_n := \{f \mid f: N \to A \text{ and } |f[N]| = n\}. \]

By CAC, select a sequence $(f_n \mid n \in N)$ with $f_n \in F_n$ for all $n \in N$.

As in Proposition 316, combine the $f_n$’s into a single $g: N \to A$ where $g(n) :=
f_{\lambda(n)}(\rho(n))$. Then $g[N] = \bigcup_n f_n[N]$ is infinite (since $|f_n[N]| = n$) and countable, so
$g[N] \sim N$. Hence $A$ is Dedekind infinite.

(Or, directly define an injection $h: N \to A$ by recursion: $h(1) := f_1(1)$
and $h(n + 1) := f_{n+1}(k)$ where $k$ is the least number such that $f_{n+1}(k) \notin
\{h(1), h(2), \dots, h(n)\}$, which is well defined since $|\operatorname{ran}(f_{n+1})| = n + 1$.) □
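
The direct recursion for $h$ can be sketched as follows, once a sequence $(f_n)$ has been fixed (the code does not replace the appeal to CAC that selects it). The helper name `build_injection` and the sample family, in which $f_n$ takes exactly the $n$ values $0, 1, 4, \dots, (n-1)^2$, are hypothetical.

```python
def build_injection(f, length):
    """Return [h(1), ..., h(length)], where f(n, k) stands for f_n(k) and f_n has exactly n values."""
    h = [f(1, 1)]
    for n in range(1, length):
        seen = set(h)
        k = 1
        while f(n + 1, k) in seen:   # find the least k with f_{n+1}(k) not among h(1), ..., h(n)
            k += 1
        h.append(f(n + 1, k))
    return h

# Hypothetical family: f_n(k) = ((k - 1) mod n)^2, so ran(f_n) has exactly n elements.
values = build_injection(lambda n, k: ((k - 1) % n) ** 2, 6)
```

The inner loop terminates because $f_{n+1}$ has $n+1$ distinct values while only $n$ values have been used so far.
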
So under countable choice, the two notions of finite sets coincide: A set $A$ is infinite
(in either sense) if and only if $N \preccurlyeq A$. But note that if $A$ is Dedekind infinite, we get
$N \preccurlyeq A$ effectively (why?), while for a general infinite set $A$, we get $N \preccurlyeq A$ only by
appealing to choice (so non-effectively).


Problem 322. Show without using any form of AC that if A is infinite then P(P(A)) 
is Dedekind infinite. 


It follows that A is finite if and only if P(P(A)) is Dedekind finite, giving a 
characterization of (inductive) finiteness in terms of Dedekind finiteness. 


The Axiom of Dependent Choice 


The Axiom of Dependent Choice (DC) allows one to make a sequence of choices 
where each choice may depend on the previous one. We will use it to derive later 
that an order which is not well-ordered (or a relation which is not well-founded) 
contains a strictly decreasing sequence of elements. 


DC (The Axiom of Dependent Choice). Let $R$ be a relation on $A$ such that for
all $x \in A$ there is $y \in A$ with $xRy$, and let $a \in A$. Then there is a sequence
$(a_n \mid n \in N) \in A^N$ such that $a_1 = a$ and $a_n R a_{n+1}$ for all $n \in N$.


Proof (AC). Put $P^*(A) := P(A) \setminus \{\emptyset\}$ and fix a choice function $\varphi: P^*(A) \to A$ such
that $\varphi(E) \in E$ for all $E \in P^*(A)$. Define $g: A \to A$ by setting $g(x) := \varphi(\{y \in
A \mid xRy\})$. Then $g$ is well defined by the given condition for the relation $R$. Hence
by the principle of recursive definition there exists a function $f: N \to A$ such that
$f(1) = a$ and $f(n + 1) = g(f(n))$ for all $n \in N$. Finally, put $a_n := f(n)$. Then
$a_1 = a$, and for all $n$ we have

\[ a_{n+1} = g(a_n) = \varphi(\{y \in A \mid a_n R y\}) \in \{y \in A \mid a_n R y\}, \]

hence $a_n R a_{n+1}$ for all $n$. □

DC is weaker than the full Axiom of Choice, but it is stronger than CAC.

Problem 323. Show (without using any form of AC) that DC implies CAC.


Problem 324. Use DC to formalize the argument at the end of Sect. 5.4 and give a
proof of Theorem 321 using DC instead of CAC.


5.8 $\aleph_0 < \mathfrak{c}$: The Cardinality of the Continuum


Definition 325. An interval in $R$ is a subset of $R$ having one of the forms:

\[ (a,b),\ (a,b],\ [a,b),\ [a,b];\quad (-\infty,a),\ (a,\infty),\ (-\infty,a],\ [a,\infty);\quad \text{or } (-\infty,\infty). \]

The first four forms above are called bounded intervals, the next four are called
half-infinite intervals, and the last interval $(-\infty, \infty)$ equals $R$ itself.

An interval is proper if it contains at least two points, while the empty set $\emptyset =
(a, a)$ and singleton sets $\{a\} = [a, a]$ are improper intervals.


Problem 326. Prove that if $a < b$ are real numbers, then the interval $(a,b)$ is
effectively equinumerous with $(0,1)$, the interval $[a,b]$ is effectively equinumerous
with $[0,1]$, and each of the intervals $(a,b]$ and $[a,b)$ is effectively equinumerous
with $(0,1]$, all via suitable linear mappings.


The figure below shows the geometric view of a one-to-one correspondence between 
the line segment AB and the line segment CD. 


More interestingly, we have the following result. 

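Problem 327. $(0,1] \sim (0,1)$, $[0,1] \sim [0,1)$, and so $[0,1] \sim (0,1)$, effectively.
[Hint: For $(0,1] \sim (0,1)$, remove from $(0,1]$ the set $\{1, \tfrac12, \tfrac13, \dots\}$ and remove
from $(0,1)$ the set $\{\tfrac12, \tfrac13, \tfrac14, \dots\}$. Note that if $A \cap B = \emptyset = A \cap C$ then $B \sim C \Rightarrow
A \cup B \sim A \cup C$.]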
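
The hint can be illustrated by a map that shifts $1/n$ to $1/(n+1)$ and fixes every other point. The sketch below is restricted to rational inputs (an assumption made only so that the arithmetic is exact); the function name `shift` is hypothetical.

```python
from fractions import Fraction

def shift(x):
    """Map (0, 1] onto (0, 1): send 1/n to 1/(n + 1) and fix every other point."""
    x = Fraction(x)
    if not 0 < x <= 1:
        raise ValueError("x must lie in (0, 1]")
    if x.numerator == 1:                        # x = 1/n for some n >= 1
        return Fraction(1, x.denominator + 1)
    return x                                    # points outside {1, 1/2, 1/3, ...} stay put
```

The same case split works for irrational points of $(0,1]$, which are simply left fixed.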


Corollary 328. Any two bounded proper intervals in R, whether open, half-open, 
or closed, are effectively equinumerous with each other. 


Problem 329. For any $a, b \in R$, $[a, \infty) \sim [0, \infty) \sim (-\infty, b]$, effectively.
[Hint: Use the maps $x \mapsto x + a$ and $x \mapsto b - x$ defined on $[0, \infty)$.]

Problem 330. $[0, \infty) \sim (0, \infty)$, effectively.
[Hint: Remove $J$ and $N$ from the intervals $[0, \infty)$ and $(0, \infty)$, respectively.]


Corollary 331. Any two half-infinite intervals, whether open or closed, are effec- 
tively equinumerous with each other. 


The following result now implies that any half-infinite interval is effectively 
equinumerous with any bounded interval. 


Problem 332. Show that $x \mapsto y = \dfrac{x}{x+1}$ maps the interval $(0, \infty)$ bijectively
onto $(0, 1)$.
The next figure geometrically illustrates how the ray $OA = (0, \infty)$ gets mapped
onto the line segment $OB = (0, 1)$.


From what we have obtained so far, we see that

\[ R = (-\infty, 0) \cup \{0\} \cup (0, \infty) \sim (-1, 0) \cup \{0\} \cup (0, 1) = (-1, 1), \]

and so $R = (-\infty, \infty)$ is effectively equinumerous with any bounded interval and
so also with any half-infinite interval. We record this important result as


Corollary 333. Any two proper intervals in R, any of which may be bounded, half- 
infinite, or R itself, are effectively equinumerous with each other. 


[Note: The above results all hold for any ordered field, not just R.] 


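By the last result, all proper real intervals have the same cardinal number which
is denoted by $\mathfrak{c}$, and is called the cardinality of the continuum.


Definition 334. $\mathfrak{c} := |(0,1]| = |[0,1]| = |(0,1)| = |R|$.

Since $N \subseteq R$, it follows that

Problem 335. $\aleph_0 \le \mathfrak{c}$.


Problem 336. Using the earlier results of this section establish each of the
following results effectively:

1. $\mathfrak{c} + \mathfrak{c} = \mathfrak{c}$, i.e., $2 \cdot \mathfrak{c} = \mathfrak{c}$; and so by induction we also have $n \cdot \mathfrak{c} = \mathfrak{c}$.
2. $\aleph_0 \cdot \mathfrak{c} = \mathfrak{c}$.
3. $\mathfrak{c} + 1 = \mathfrak{c}$; and so by induction $\mathfrak{c} + n = \mathfrak{c}$.
4. $\mathfrak{c} + \aleph_0 = \mathfrak{c}$.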


We now turn to the following remarkable result of Cantor, often expressed by the
statement “$R$ is uncountable.” Unlike the set of rational numbers, the real numbers
cannot be exhaustively listed as a sequence.


Theorem 337 (Uncountability of R, Cantor). No proper interval is countable.
Hence $R$ is not countable, that is, $R \not\preccurlyeq N$, and so $N \not\sim R$.


Proof. Since all proper intervals are equinumerous to each other and to $R$, it suffices
to show that $[0,1]$ is not countable, that is, $[0,1]$ cannot be the range of any function
with domain $N$. We will show, following Cantor, that if $f: N \to R$ then its range
$f[N] = \operatorname{ran}(f)$ cannot include all of $[0,1]$, i.e., there is $p \in [0,1]$ which is not in
the range of $f$, and so, in particular, $f$ certainly cannot be a bijection between $N$
and $[0,1]$.
Given a bounded closed interval $I = [a,b]$ with $a < b$, we trisect $[a,b]$ and
subdivide it into three equal subintervals each of length $\ell := \tfrac13 \operatorname{len}(I) = \tfrac{b-a}{3}$,
so that $a < a + \ell < a + 2\ell < b$. By removing the middle-third open interval,
$I = [a,b]$ splits into two disjoint closed subintervals, called the left-third and
right-third subintervals, as

\[ L[I] := [a, a + \ell] = \text{left-third of } [a,b], \quad \text{and} \quad R[I] := [a + 2\ell, b] = \text{right-third of } [a,b]. \]

The figure below illustrates how this is done.

[Figure: Trisecting the interval $I = [a,b]$, where $\ell := \tfrac{b-a}{3}$, with $L[I] = [a, a+\ell]$ on the left and $R[I] = [a+2\ell, b]$ on the right.]


Now let $f: N \to R$ be an arbitrary mapping, and put $a_n = f(n)$ for each $n \in N$,
which gives the sequence $(a_1, a_2, \dots, a_n, \dots)$ enumerating the range of $f$. We will
find an element $p \in [0,1]$ which is not in the range of $f$, showing that $f$ is not
onto.

We say that a set $A$ avoids the real $x$ if and only if $x \notin A$. The crucial fact used
in this proof is that for any given interval $I$ and any real $x$, one of the subintervals
$L[I]$ or $R[I]$ will avoid $x$.

Let $I = [0,1]$. Then $a_1$ cannot be both in $L[I] = [0, \tfrac13]$ and in $R[I] = [\tfrac23, 1]$.
We will take $I_1$ to be one of these two subintervals, making sure that $I_1$ avoids $a_1$.
To be definite, let

\[ I_1 := \begin{cases} L[I] & \text{if } a_1 \notin L[I], \\ R[I] & \text{otherwise.} \end{cases} \]

The point is that $I_1$ is a closed subinterval of $I$ such that $I_1$ avoids $a_1$.

We continue in this fashion in stages, choosing closed intervals $I_1 \supseteq I_2 \supseteq \cdots \supseteq
I_n$ such that $I_n$ avoids $a_n$. To be definite, we specify how to go from stage-$n$ to
stage-$(n + 1)$. Given that $I_n$ has been constructed in stage-$n$, put

\[ I_{n+1} := \begin{cases} L[I_n] & \text{if } a_{n+1} \notin L[I_n], \\ R[I_n] & \text{otherwise.} \end{cases} \]

Then $I_{n+1}$ is a closed subinterval of $I_n$ such that $I_{n+1}$ avoids $a_{n+1}$.

This gives us a sequence of nested and bounded closed intervals

\[ I_1 \supseteq I_2 \supseteq \cdots \supseteq I_n \supseteq I_{n+1} \supseteq \cdots \quad \text{with } a_n \notin I_n \text{ for all } n, \]

and so by the Nested Interval Property there is $p \in \bigcap_{n=1}^{\infty} I_n$. But $p \ne a_n$ for any
$n \in N$, since $p \in I_n$ while $a_n \notin I_n$. So $p$ is not in the range of $f$. □


Corollary 338. $\aleph_0 < \mathfrak{c}$.


The Cantor Machine 


Note the effective nature of Cantor’s proof: Given any sequence of real numbers
$(a_n \mid n \in N) \in R^N$, the procedure in the above proof effectively and uniquely
produces a point $p \in [0,1]$ with $p \ne a_n$ for all $n$. This effective mapping $(a_n) \mapsto p$
will be denoted by $M$, so that $M: R^N \to [0,1]$ with $M((a_n)) \notin \{a_n \mid n \in N\}$ for all
sequences $(a_n) \in R^N$. The mapping $M$, which is pictured below, will be referred to
as the Cantor Machine.

\[
\underset{\text{Input sequence of reals}}{(a_n)} \;\longrightarrow\; \underset{\text{Cantor Machine}}{\boxed{M}} \;\longrightarrow\; \underset{\text{Output real}}{M((a_n)) \in [0,1] \setminus \{a_n \mid n \in N\}}
\]

Thus the phenomenon in Cantor’s proof is summarized as follows: Given a
sequence of reals $(a_n)$ as input, the Cantor Machine $M$ responds by producing
an output real $p = M((a_n)) \in [0,1]$ which differs from every term of the given
sequence. We informally express this by saying that the point $p$ is diagonalized
away from the given list of reals $a_1, a_2, \dots, a_n, \dots$.
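
The stages of the Cantor Machine can be sketched in code. The illustration below makes two assumptions not present in the text: the input sequence consists of rationals (so that the membership tests are exact), and only finitely many stages are run, so the result is the interval $I_n$ containing $p$ rather than $p$ itself. The name `cantor_machine` is hypothetical.

```python
from fractions import Fraction

def cantor_machine(a, stages):
    """Return (lo, hi) with [lo, hi] = I_stages, a closed interval avoiding a(1), ..., a(stages)."""
    lo, hi = Fraction(0), Fraction(1)        # start from I = [0, 1]
    for n in range(1, stages + 1):
        third = (hi - lo) / 3
        x = a(n)
        if not (lo <= x <= lo + third):      # a_n is not in L[I]: keep the left third
            hi = lo + third
        else:                                # a_n is in L[I], hence not in R[I]: keep the right third
            lo = hi - third
    return lo, hi

# Hypothetical input sequence a_n = 1/n; every stage keeps the left third.
lo, hi = cantor_machine(lambda n: Fraction(1, n), 5)
```

Since the intervals are nested, the final interval avoids every term examined so far, and its length shrinks by a factor of 3 at each stage.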


5.9 CH: The Continuum Hypothesis 


We now have the following examples of distinct cardinal numbers:

\[ 0 < 1 < 2 < \cdots < n < n + 1 < \cdots < \aleph_0 < \mathfrak{c}. \]

By Corollary 300, we know that the sequence of cardinals from 0 to $\aleph_0$ is complete
in the sense that other than the finite cardinals and $\aleph_0$ there is no cardinal which can
be “placed in between them.” The question now arises if the sequence of cardinals
from 0 to $\mathfrak{c}$ displayed above is complete in the above sense, and it reduces to the
question:

Is there a cardinal $\kappa$ such that $\aleph_0 < \kappa < \mathfrak{c}$?

The Continuum Hypothesis, or CH, is the assertion that there is no such cardinal.
CH is equivalent to the statement that for every subset $A$ of $R$, either $A$ is countable
(i.e., $|A| \le \aleph_0$) or $A \sim R$ (i.e., $|A| = \mathfrak{c}$). If CH is false, there would exist
uncountable subsets of $R$ not equinumerous to $R$.

Cantor tried to decide if CH is true or not, but failed. Other mathematicians in
the early twentieth century also tried, but the question remained open. Much of the research
in set theory in the twentieth century was dominated by this question. We will return
to the topic later.


5.10 More Countable Sets and Enumerations 


Recall that J = {0,1,2,...} denotes the set of all finite (inductive) cardinals, N = 
{1,2,3,...} is the set of all natural numbers (nonzero finite cardinals), and Z the 
set of all integers (positive, negative, or zero). Given any set A, we let A* denote the 
set of all finite sequences from A. The members of A* are also called the strings or 
words over the alphabet A. 


Problem 339. Find an effective bijection between J and the collection of all finite 
subsets of J. Conclude that the collection of all finite subsets of N is effectively 
denumerable, and more generally that the collection of all finite subsets of an 
effectively denumerable set is effectively denumerable. 


[Hint: Given $m \in J$, consider the set $A_m$ of all $k \in J$ such that the bit at position
$k$ in the binary representation of $m$ is 1, where the least significant bit is defined as
position 0. In other words, $A_m = \{k \in J \mid \lfloor m/2^k \rfloor \text{ is odd}\}$, where $\lfloor x \rfloor$ denotes the
greatest integer not greater than $x$.]
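
The correspondence $m \mapsto A_m$ in the hint can be sketched with bit operations. The function names `index_to_set` and `set_to_index` are hypothetical.

```python
def index_to_set(m):
    """A_m = {k : floor(m / 2^k) is odd}, the positions of the 1 bits of m."""
    return frozenset(k for k in range(m.bit_length()) if (m >> k) & 1)

def set_to_index(s):
    """Inverse map: a finite subset of J goes to the sum of 2^k over its members k."""
    return sum(1 << k for k in s)
```

For instance, 5 is 101 in binary, so it corresponds to the set {0, 2}, and 0 corresponds to the empty set.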


Problem 340. Show that the set $N^*$ of all finite sequences of natural numbers is
effectively bijective with the collection of all finite subsets of $N$. Conclude that $N^*$
is effectively denumerable.


[Hint: Consider the mapping which sends a finite sequence $(n_1, n_2, \dots, n_k)$ in $N^*$
to the finite set $\{n_1, n_1 + n_2, \dots, n_1 + n_2 + \cdots + n_k\}$.]
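
The partial-sums mapping of the hint, which sends $(n_1, n_2, \dots, n_k)$ to $\{n_1, n_1+n_2, \dots, n_1+\cdots+n_k\}$, and its inverse can be sketched as follows. The names `seq_to_set` and `set_to_seq` are hypothetical.

```python
from itertools import accumulate

def seq_to_set(seq):
    """Send a finite sequence of natural numbers to the set of its partial sums."""
    return frozenset(accumulate(seq))

def set_to_seq(s):
    """Inverse map: sort the finite set and take successive differences."""
    elems = sorted(s)
    return tuple(b - a for a, b in zip([0] + elems, elems))
```

Because every term is at least 1, the partial sums are strictly increasing, which is why the sequence can be recovered from the set.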


Problem 341. Prove that the set of words over any nonempty effectively countable 
alphabet is effectively denumerable. In particular, the set of all computer programs 
in any programming language is effectively denumerable. 


Problem 342. Prove that the set $Z[x]$ of all polynomials with integer coefficients is
effectively countable.


Definition 343. A real number is algebraic if it is a root of some nonzero polyno- 
mial with integer coefficients. Otherwise, it is transcendental. 


Problem 344. Every rational number is algebraic, but there are infinitely many 
algebraic numbers which are not rational. 
Problem 345 (Cantor). The set of algebraic numbers is countable.

Corollary 346 (Cantor). There exist transcendental numbers.


Since every nonzero polynomial can have at most finitely many roots, one can 
establish Problem 345 using the result that a countable union of finite sets is 
countable (Proposition 317), but such a proof requires the use of CAC (the 
Countable Axiom of Choice) and so is not effective. However, it is in fact possible 
to prove that the set of algebraic numbers is effectively countable. 


Problem 347. Show that given an effectively countable family of finite subsets of 
R, their union is effectively countable. Conclude that the set of algebraic numbers 
is effectively countable. 


By the last problem, the algebraic numbers can be effectively enumerated in a
specific sequence $a_1, a_2, \dots, a_n, \dots$. Since the Cantor Machine effectively
“diagonalizes out” a real $p$ different from all the $a_n$’s, Cantor’s method is able to
effectively specify a particular transcendental number.

Long before Cantor, Liouville had given examples of transcendental numbers
which are even more effective. For example, he proved that the Liouville Constant

\[
\sum_{n=1}^{\infty} \frac{1}{10^{n!}} = \frac{1}{10^{1}} + \frac{1}{10^{2}} + \frac{1}{10^{6}} + \frac{1}{10^{24}} + \cdots = 0.110001000000000000000001000\cdots,
\]

whose decimal expansion has the digit one at position $n!$ for every $n$ with all other
digits being zero, is a transcendental number.

Liouville’s proof method was specific to number theory. Cantor’s new method of 
proof, on the other hand, is applicable to much wider contexts beyond the theory of 
algebraic and transcendental numbers. 


Problem 348. Show that the rules

\[ f(1) := 1, \quad f(2n) := f(n) + 1, \quad \text{and} \quad f(2n+1) := 1/f(2n) \qquad (n \in N) \]

define a unique function $f: N \to Q^+$ which is in fact a one-to-one correspondence
between $N$ and the set $Q^+$ of positive rational numbers.
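
The recursion can be explored numerically. The sketch below assumes the rules read $f(1) := 1$, $f(2n) := f(n) + 1$, and $f(2n+1) := 1/f(2n)$, and uses exact rational arithmetic; it only tabulates values and is not a proof of bijectivity.

```python
from fractions import Fraction

def f(n):
    """Evaluate f(1) = 1, f(2n) = f(n) + 1, f(2n + 1) = 1 / f(2n)."""
    if n == 1:
        return Fraction(1)
    if n % 2 == 0:
        return f(n // 2) + 1       # even argument: add 1 to the value at n/2
    return 1 / f(n - 1)            # odd argument > 1: reciprocal of the preceding value

first_values = [f(n) for n in range(1, 9)]
```

Values at even arguments exceed 1 and values at odd arguments greater than 1 lie below 1, which is the starting point for an inductive proof.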


Chapter 6 
Cardinal Arithmetic and the Cantor Set 


Abstract We continue the basic theory of cardinals, covering the Cantor–Bernstein
Theorem, arbitrary cardinal products and cardinal arithmetic, binary trees and the
construction of the Cantor set, the identity $2^{\aleph_0} = \mathfrak{c}$ and effective bijections between
familiar sets of cardinality $\mathfrak{c}$, Cantor’s theorem and König’s inequality, and the
behavior of $\kappa^{\aleph_0}$ for various cardinals $\kappa$.


6.1 The Cantor—Bernstein Theorem 


The following basic result says that for cardinals $\alpha$ and $\beta$, if $\alpha \le \beta$ and $\beta \le \alpha$ then
$\alpha = \beta$. Among other things, it greatly facilitates cardinal arithmetic.


Theorem 349 (Cantor–Bernstein). If $C \supseteq B$ and $f: C \to B$ is a one-to-one
function “reflecting” $C$ into the subset $f[C]$ of $B$ so that $C \supseteq B \supseteq f[C]$, then
$B \sim C$.


Discussion and proof. Figure 6.1 shows the analogy with Royce’s illustration of
map.

Put $A := C \setminus B$. The set $C$ is then viewed as a “country” with “provinces” $A$
and $B$, and $f$ is viewed as a “mapping” in the sense of cartography: Country $C$ has
just two provinces $A$ and $B$ (Fig. 6.1a), and a perfect map $C_1$ of Country $C$ is made
upon the surface of Province $B$, so that $C_1$ consists of a map $A_1$ of $A$ and a map $B_1$
of $B$ (Fig. 6.1b). Since the map is correct, $B_1$ must contain a “map of the map,” $C_2$,
consisting of $A_2$ and $B_2$; and $B_2$ must contain a “map of the map of the map”; and
so on (Fig. 6.1c).

The one-to-one mapping $f$ maps $A$ onto its map $A_1 = f[A]$, so $A \sim A_1$, with $A_1$
disjoint from $A$ (since $A_1 \subseteq B$). Similarly $A_2 = f[A_1]$ is similar (equinumerous)
to both $A$ and $A_1$, and is disjoint from both of them. So the “iterated maps” of
Province $A$, shown shaded in Fig. 6.1c as $A_1, A_2, \dots$, form an infinite sequence of
pairwise disjoint “copies” of Province $A$:

\[ A \sim A_1 \sim A_2 \sim A_3 \sim \cdots \quad \text{(all similar and pairwise disjoint).} \]

Fig. 6.1 Royce’s illustration of map for the proof of the theorem. (a) Country $C$ consists of two
provinces: $A$ (shown shaded) and $B$ (shown unshaded); (b) A map $C_1$ of Country $C$ is placed
within Province $B$, so $C_1$ itself contains maps $A_1$ and $B_1$ for Provinces $A$ and $B$; (c) The map $C_1$
in turn must contain a “map of the map” $C_2$, consisting of $A_2$ and $B_2$, and so on

Put $A^* := A_1 \cup A_2 \cup A_3 \cup \cdots$. Now a key observation is that $(A \cup A^*) \sim A^*$,
since $f$ “shifts” the pairwise disjoint sets $A, A_1, A_2, \dots$ (shaded in the figure) to the
“next” sets $A_1, A_2, A_3, \dots$. More precisely, $f \restriction_{A \cup A^*}: A \cup A^* \to A^*$ is a bijection
from $A \cup A^*$ onto $A^*$, since $f$ is injective and

\[
\begin{aligned}
f[A \cup A^*] = f[A] \cup f[A^*] &= f[A] \cup f[A_1 \cup A_2 \cup \cdots] \\
&= A_1 \cup f[A_1] \cup f[A_2] \cup \cdots \\
&= A_1 \cup A_2 \cup A_3 \cup \cdots = A^*.
\end{aligned}
\]

Finally, let $E := C \setminus (A \cup A^*)$, the entire remaining unshaded part in Fig. 6.1c.
Then $C = (A \cup A^*) \cup E$ and $B = A^* \cup E$. Since $(A \cup A^*) \sim A^*$ and since $E$ is
disjoint from both $A \cup A^*$ and $A^*$, it follows that:

\[ (A \cup A^*) \cup E \sim A^* \cup E, \quad \text{or:} \quad C \sim B. \qquad \square \]


It should be noted in the above proof that the final one-to-one correspondence,
say $g$, between $C$ and $B$ is described as the mapping on $C$ which fixes every point
outside $A \cup A^*$ and sends each point $x$ in $A \cup A^*$ to $f(x)$. Formally, the bijection
$g: C \to B$ is the union of $f$ restricted to $A \cup A^*$ with the identity map restricted to
$C \setminus (A \cup A^*)$, that is:

\[ g(x) = \begin{cases} f(x) & \text{if } x \in (A \cup A^*), \\ x & \text{if } x \in C \setminus (A \cup A^*). \end{cases} \]

Therefore the Cantor–Bernstein theorem is an effective theorem: A bijection $g: C \to
B$ can be effectively specified in terms of the given function and sets.
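
The definition of the bijection (apply $f$ on $A \cup A^*$, fix every other point) can be sketched in code for a concrete instance. The example below is hypothetical: it takes $C = \{0, 1, 2, \dots\}$, $B = \{1, 2, 3, \dots\}$ and $f(x) = x + 2$, so that $A = C \setminus B = \{0\}$. Membership in $A \cup A^*$ is tested by pulling a point back along $f$ until it either lands in $A$ or leaves $f[C]$, which assumes that every backward chain is finite.

```python
def cantor_bernstein(f, f_inverse, in_A):
    """Return g: C -> B that equals f on A ∪ A* and the identity elsewhere.

    f_inverse(x) must return the unique y with f(y) == x, or None if x is not in f[C].
    """
    def in_A_or_A_star(x):
        # x lies in A ∪ A* exactly when pulling it back along f eventually lands in A.
        while not in_A(x):
            x = f_inverse(x)
            if x is None:
                return False
        return True

    return lambda x: f(x) if in_A_or_A_star(x) else x

# Hypothetical instance: C = {0, 1, 2, ...}, B = {1, 2, 3, ...}, f(x) = x + 2, A = {0}.
g = cantor_bernstein(lambda x: x + 2,
                     lambda x: x - 2 if x >= 2 else None,
                     lambda x: x == 0)
```

In this instance $A \cup A^*$ is the set of even numbers, so `g` shifts the even numbers up by 2 and fixes the odd numbers.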


The theorem is often stated in the following equivalent “symmetric” form. 


Theorem 350 (Cantor–Bernstein, Symmetric Version). If $A \preccurlyeq B$ and $B \preccurlyeq A$
then $A \sim B$. Therefore, the relation $\le$ defined on the cardinals is antisymmetric,
i.e., if $\alpha \le \beta$ and $\beta \le \alpha$, or equivalently if $\alpha =^* \beta$, then $\alpha = \beta$. It follows that
$\alpha \le \beta$ if and only if $\alpha < \beta$ or $\alpha = \beta$.


Problem 351. Show that this symmetric version of the Cantor–Bernstein Theorem
is equivalent to the earlier version.


Problem 352. The Cantor–Bernstein Theorem is equivalent to the assertion that if
$\alpha, \beta, \gamma$ are cardinals with $\alpha + \beta + \gamma = \alpha$, then $\alpha + \beta = \alpha$.


Historical note. The Cantor–Bernstein Theorem was conjectured by Cantor, partially
proved by Schröder, and a full proof was published by Bernstein. But earlier
than all of these, Dedekind had actually proved the theorem, but he never published
his proof. The theorem is most often called the Schröder–Bernstein Theorem, and
sometimes simply Bernstein’s Theorem.


6.2 Arbitrary Sums and Products of Cardinals


Given an indexed family $(\alpha_i \mid i \in I)$ of cardinal numbers, we wish to define the
“general sum” $\sum_{i \in I} \alpha_i$ as follows. First use AC to choose representative sets $A_i$ of
cardinality $\alpha_i$ so that $|A_i| = \alpha_i$ for each $i \in I$. Then the sets $\{i\} \times A_i$ are pairwise
disjoint, with $|\{i\} \times A_i| = \alpha_i$ ($i \in I$). So we may define:

\[ \sum_{i \in I} \alpha_i := \Bigl| \bigcup_{i \in I} \{i\} \times A_i \Bigr|. \]


For this definition to work properly, we need one more application of AC: 


Problem 353 (Uniqueness of General Sum, AC). Let $(A_i \mid i \in I)$ and
$(B_i \mid i \in I)$ be indexed families of pairwise disjoint sets (i.e., for $i, j \in I$,
$i \ne j \Rightarrow A_i \cap A_j = \emptyset = B_i \cap B_j$). If $A_i \sim B_i$ for each $i \in I$, then

\[ \bigcup_{i \in I} A_i \sim \bigcup_{i \in I} B_i. \]

Let $(\alpha_i \mid i \in I)$ be an indexed family of cardinals. Without AC, we cannot guarantee
the existence of a representative family of sets $(A_i \mid i \in I)$ with $|A_i| = \alpha_i$ for $i \in I$.
Even if such a family exists, we cannot assume, without AC, that if $|A_i| = |A'_i| = \alpha_i$
for all $i \in I$ then $\bigcup_{i \in I} \{i\} \times A_i \sim \bigcup_{i \in I} \{i\} \times A'_i$.


Definition 354 (Sum-Adequate Families of Cardinals). An indexed family of
cardinals $(\alpha_i \mid i \in I)$ is sum-adequate if there is a representative family of sets
$(A_i \mid i \in I)$ such that $|A_i| = \alpha_i$ for all $i \in I$ and whenever $A'_i \sim A_i$ for all $i \in I$,
we have $\bigcup_{i \in I} \{i\} \times A'_i \sim \bigcup_{i \in I} \{i\} \times A_i$.

Under AC, every family of cardinals is sum-adequate. If AC is not assumed,
arbitrary cardinal sums are significant only for sum-adequate families.¹


Definition 355 (General Cardinal Sum). For a sum-adequate family of cardinal
numbers $(\alpha_i \mid i \in I)$, we define the general cardinal sum as:

\[ \sum_{i \in I} \alpha_i := \Bigl| \bigcup_{i \in I} \{i\} \times A_i \Bigr|, \]

where each $A_i$ is a representative set of cardinality $\alpha_i$, i.e., $|A_i| = \alpha_i$ for $i \in I$. If
$(\alpha_i \mid i \in I)$ is not sum-adequate, we define the sum $\sum_{i \in I} \alpha_i$ to be zero.


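Problem 356 (AC). Assuming AC, show that the general cardinal sum of any
indexed family of cardinals is well defined and unique: For any indexed family of
cardinal numbers $(\alpha_i \mid i \in I)$, there is a unique cardinal number $\alpha$ and a pairwise
disjoint family $(A_i \mid i \in I)$ of sets such that

\[ |A_i| = \alpha_i \text{ for all } i \in I, \quad \text{and} \quad \alpha = \Bigl| \bigcup_{i \in I} A_i \Bigr|. \]

Problem 357 (AC). If $|A| = \alpha$, $|B| = \beta$, and $\alpha_b = \alpha$ for each $b \in B$, then

\[ \alpha\beta = \sum_{b \in B} \alpha_b = \sum_{b \in B} \alpha. \]

[Hint: $A \times B$ partitions as $\bigcup_{b \in B} A \times \{b\}$ with $A \sim A \times \{b\}$ for all $b \in B$.]

Problem 358 (AC). Prove the distributive law for arbitrary cardinal sums:

\[ \alpha \Bigl( \sum_{i \in I} \beta_i \Bigr) = \sum_{i \in I} \alpha\beta_i. \]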


Definition 359. Let $I$ be any set. We say that $a$ is an $I$-tuple if $a$ is a function
with domain $\operatorname{dom}(a) = I$. If $a$ is an $I$-tuple, then the value $a(i)$ is called the $i$-th
coordinate of $a$ and is denoted by $a_i$, so that $a = (a_i \mid i \in I)$.


Definition 360. The Cartesian product of an indexed family $(A_i \mid i \in I)$ of sets is
defined as the set of all $I$-tuples $(a_i \mid i \in I)$ whose $i$-th coordinate ranges over the
set $A_i$, for each $i \in I$. In notation:

\[ \prod_{i \in I} A_i := \{(a_i \mid i \in I) \mid \forall i \in I,\ a_i \in A_i\}. \]


¹ As Russell’s socks-and-boots example noted, without AC, the family $(2, 2, 2, \dots)$ may fail to be
sum-adequate: The ambiguous infinite sum $2 + 2 + 2 + \cdots$ may be $\aleph_0$, non-reflexive, or $> \aleph_0$.
The failure for the sequence $(\aleph_0, \aleph_0, \aleph_0, \dots)$ is striking too: $R$ may be partitioned into countably
many countable sets (Feferman–Levy, see [33]). If each $\alpha_i$ is well-orderable, then one may use
canonical representatives to define a “principal value” for $\sum_i \alpha_i$ even if $(\alpha_i)$ is not sum-adequate
(Whitehead, see [48]).


Problem 361 (AC). If $A_i \sim A'_i$ for all $i \in I$, then $\prod_{i \in I} A_i \sim \prod_{i \in I} A'_i$.

Let $(\kappa_i \mid i \in I)$ be an indexed family of cardinals. Without AC, if $|A_i| = |A'_i| = \kappa_i$
for all $i \in I$, we cannot conclude that $\prod_{i \in I} A_i \sim \prod_{i \in I} A'_i$. In fact, without AC we
cannot even conclude that a representative family of sets $(A_i \mid i \in I)$ with $|A_i| = \kappa_i$
exists.


Definition 362 (Product-Adequate Families of Cardinals). An indexed family of
cardinals $(\kappa_i \mid i \in I)$ is product-adequate if there is a representative family of sets
$(A_i \mid i \in I)$ such that $|A_i| = \kappa_i$ for all $i \in I$ and whenever $A'_i \sim A_i$ for all $i \in I$,
we have $\prod_{i \in I} A'_i \sim \prod_{i \in I} A_i$.


If AC is not assumed, arbitrary cardinal products are significant only for
product-adequate families.


Definition 363 (Arbitrary Cardinal Products). For a product-adequate family
$(\kappa_i \mid i \in I)$ of cardinals, define the cardinal product of the family as:

\[ \prod_{i \in I} \kappa_i := \Bigl| \prod_{i \in I} A_i \Bigr|, \]

where $A_i$ is a set with $|A_i| = \kappa_i$ for each $i \in I$. If $(\kappa_i \mid i \in I)$ is not
product-adequate, we define the product $\prod_{i \in I} \kappa_i$ to be zero.


Under AC, every family of cardinals is product-adequate, and every family of 
nonzero cardinals has a unique nonzero cardinal as the product. 


Alternative View of Products 


Suppose that $(B_i \mid i \in I)$ is an indexed family of pairwise disjoint nonempty sets
(i.e., $B_i \ne \emptyset$ and $B_i \cap B_j = \emptyset$ whenever $i \ne j$), so that $\{B_i \mid i \in I\}$ forms a
partition. We would expect the number of choice sets from this partition $\{B_i \mid i \in I\}$
to be equal to the product of the size of the $B_i$’s. The following result confirms this
formally.


Problem 364. Given an indexed family $(A_i \mid i \in I)$ of nonempty sets, put $B_i :=
\{i\} \times A_i$ so that each $B_i \sim A_i$, but $B_i \cap B_j = \emptyset$ for $i \ne j$. Let $C$ be the collection
of all choice sets from the partition $\{B_i \mid i \in I\}$. Then $C \sim \prod_{i \in I} A_i$. In fact,

\[ C = \prod_{i \in I} A_i. \]

Thus the alternative definition of product (using choice sets from a partition)
coincides with the original Cartesian product!
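
For a finite family, the identification of choice sets with $I$-tuples can be checked directly. The sketch below is only an illustration: the family `family` and the helper `cartesian_product` are hypothetical, and an $I$-tuple is represented as a Python dict from indices to coordinates.

```python
from itertools import product

def cartesian_product(family):
    """All I-tuples (as dicts i -> a_i) with a_i in family[i] for every index i."""
    keys = list(family)
    return [dict(zip(keys, values)) for values in product(*(family[i] for i in keys))]

# Hypothetical finite family with index set I = {"x", "y"}.
family = {"x": {0, 1}, "y": {"a", "b", "c"}}
tuples = cartesian_product(family)

# Viewed as a set of pairs (i, a_i), each I-tuple picks exactly one element
# from each B_i = {i} x A_i, so it is a choice set from the partition.
choice_sets = {frozenset(t.items()) for t in tuples}
```

Here the product has $2 \cdot 3 = 6$ members, matching the product of the sizes of the sets in the family.
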
Here is why Russell called the Axiom of Choice the Multiplicative Axiom.

Problem 365. The Axiom of Choice is equivalent to the assertion that an arbitrary
product of nonzero cardinals is nonzero: If $(\kappa_i \mid i \in I)$ is an indexed family of
cardinals with $\kappa_i \ne 0$ for all $i \in I$, then $\prod_{i \in I} \kappa_i \ne 0$.


6.3 Cardinal Exponentiation: $|\mathcal{P}(A)| = 2^{|A|}$


If $A_i = A$ for all $i \in I$, so that $\kappa_i = |A_i| = |A| = \kappa$ (say) for all $i$, then the cardinal
product $\prod_{i \in I} \kappa_i$ reduces to exponentiation, as in (informally):

\[ \prod_{i \in I} |A_i| = \prod_{i \in I} \kappa_i = \prod_{i \in I} |A| = \prod_{i \in I} \kappa = |A|^{|I|} = \kappa^{|I|}. \]


Problem 366. If $A_i = A$ for all $i \in I$, then the Cartesian product $\prod_{i \in I} A_i =
\prod_{i \in I} A$ equals the collection of all functions from $I$ to $A$, i.e., $\prod_{i \in I} A =
\{F \mid F\colon I \to A\}$.


The last fact simplifies the definition of cardinal exponentiation. 


Definition 367 (Exponential Sets). Given sets $A$ and $B$, we define $A^B$ to be the
set of all functions from $B$ to $A$, i.e.,

\[ A^B := \{F \mid F\colon B \to A\}. \]


The following is similar to Problem 361, but here we do not need AC. 


Problem 368 (Invariance of Exponentiation). If $A \sim C$ and $B \sim D$ then
$A^B \sim C^D$.


The following is therefore well defined: 


Definition 369 (Cardinal Exponentiation, Cantor). Given cardinals $\alpha$ and $\beta$,
define

\[ \alpha^{\beta} := |A^B|, \quad \text{where $A$ and $B$ are representative sets with } |A| = \alpha,\ |B| = \beta. \]


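Problem 370. Let $\kappa$ be a cardinal. Using the last definition of exponentiation show
that $\kappa^1 = \kappa$, and $\kappa^{n+1} = \kappa^n \cdot \kappa$ for any $n \in \mathbb{N}$. Informally: $\kappa^n = \kappa \cdot \kappa \cdots \kappa$ ($n$ times).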
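For finite sets the exponential set can be listed outright. The sketch below is an illustration only: it assumes functions are represented as Python dicts, and `exponential_set` is a hypothetical helper name, not notation from the text.

```python
from itertools import product

def exponential_set(A, B):
    """Return A^B: every function from B to A, each stored as a dict."""
    B = list(B)
    return [dict(zip(B, values)) for values in product(A, repeat=len(B))]

A = ["x", "y", "z"]
B = [0, 1]
functions = exponential_set(A, B)

# |A^B| = |A|^|B| for finite sets.
print(len(functions) == len(A) ** len(B))

# kappa^(n+1) = kappa^n * kappa, checked for kappa = 3 and n = 2.
print(len(exponential_set(A, [0, 1, 2])) == len(functions) * len(A))
```

Note that `exponential_set(A, [])` contains exactly one function (the empty one), matching $\kappa^0 = 1$.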


Definition 371 (Characteristic Functions). We say that $f$ is a characteristic
function on $I$ if $f \in \{0,1\}^I$, that is, if $f\colon I \to \{0,1\}$.

If $E \subseteq I$, we let $\chi_E$ denote the characteristic function on $I$ defined by $\chi_E(i) = 1$
if $i \in E$ and $\chi_E(i) = 0$ if $i \notin E$.

Definition 372 (Binary $I$-tuples). An $I$-tuple $a = \langle a_i \mid i \in I \rangle$ is called a binary
$I$-tuple if and only if for all $i \in I$, $a_i = 0$ or $a_i = 1$. Thus a binary $I$-tuple is simply a
characteristic function on $I$. Given a binary $I$-tuple $a = \langle a_i \mid i \in I \rangle$, the value of $a$
at $i$, i.e., $a(i) = a_i$, is called the $i$-th bit of $a$.


Recall the definitions of infinite sequences and strings from Sect. 1.7. 


Definition 373 (Infinite Binary Sequences). When $I = \mathbb{N}$, a binary $\mathbb{N}$-tuple is
called an infinite binary sequence, so that $\{0,1\}^{\mathbb{N}}$ is the set of all infinite binary
sequences. An infinite binary sequence $a = \langle a_n \mid n \in \mathbb{N} \rangle$ is written by simply
writing out its bits in order, i.e., as $a = a_1 a_2 a_3 \cdots a_n \cdots$.


For any set $I$, there is a very natural effective one-to-one correspondence between
its power set $\mathcal{P}(I)$ and the set $\{0,1\}^I$ of all characteristic functions on $I$, making
the two collections $\mathcal{P}(I)$ and $\{0,1\}^I$ virtually interchangeable.


Problem 374 (Cantor). $\mathcal{P}(I) \sim \{0,1\}^I$, via an effective natural bijection.
In particular, $\mathcal{P}(\mathbb{N}) \sim \{0,1\}^{\mathbb{N}}$ via a natural effective bijection between the set of
all subsets of $\mathbb{N}$ and the set of all infinite binary sequences.


[Hint: Consider the mapping $E \mapsto \chi_E$ from $\mathcal{P}(I)$ to $\{0,1\}^I$.]

Corollary 375. If $|A| = \kappa$ then $|\mathcal{P}(A)| = 2^{\kappa}$. In other words, $|\mathcal{P}(A)| = 2^{|A|}$.
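The correspondence $E \mapsto \chi_E$ can be tried out on a small finite index set. This is a sketch under the assumption that a characteristic function is stored as a tuple of bits listed in the order of $I$; the helper names `chi` and `subset_of` are introduced only for this illustration.

```python
from itertools import combinations, product

def chi(E, I):
    """Characteristic function of E on I, as a tuple of bits in the order of I."""
    return tuple(1 if i in E else 0 for i in I)

def subset_of(bits, I):
    """Inverse map: the subset of I whose characteristic function is `bits`."""
    return frozenset(i for i, b in zip(I, bits) if b == 1)

I = ["p", "q", "r"]
power_set = [frozenset(c) for r in range(len(I) + 1) for c in combinations(I, r)]

images = {chi(E, I) for E in power_set}
print(images == set(product((0, 1), repeat=len(I))))          # onto {0,1}^I
print(all(subset_of(chi(E, I), I) == E for E in power_set))   # one-to-one
print(len(power_set) == 2 ** len(I))                          # |P(I)| = 2^|I|
```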


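Corollary 376. If $A$ is denumerable, then $|\mathcal{P}(A)| = 2^{\aleph_0}$, and thus every denumerable
set has exactly $2^{\aleph_0}$ subsets. In particular $|\mathcal{P}(\mathbb{N})| = |\mathcal{P}(\mathbb{Q})| = 2^{\aleph_0}$, i.e., each of
$\mathbb{N}$ and $\mathbb{Q}$ has exactly $2^{\aleph_0}$ subsets.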


Problem 377. $A \preccurlyeq \mathcal{P}(A) \sim \{0,1\}^A$ for any set $A$, and $\kappa \leq 2^{\kappa}$ for any cardinal $\kappa$.

Problem 378. $\mathfrak{c} \leq 2^{\aleph_0}$.

[Hint: Find an injection from $\mathbb{R}$ into $\mathcal{P}(\mathbb{Q})$.]

Since we have earlier proved that $\aleph_0 < \mathfrak{c}$, we now get:


Corollary 379. $\aleph_0 < 2^{\aleph_0}$. Thus there are uncountably many infinite binary
sequences, and so uncountably many subsets of $\mathbb{N}$.


This last fact is a special case of Cantor’s Theorem, and we will revisit this in a later 
section where we will also prove the general case. 


6.4 Cardinal Arithmetic 


Since we have already defined sum, product, and power of cardinals, we can look 
for their algebraic properties. Throughout this section we assume AC. 


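Definition 380 (Rearrangement or Permutation). If $\langle \kappa_i \mid i \in I \rangle$ and $\langle \kappa'_i \mid i \in I \rangle$
are two families of cardinals indexed by the same index set $I$, we say that
$\langle \kappa'_i \mid i \in I \rangle$ is a rearrangement (or permutation) of $\langle \kappa_i \mid i \in I \rangle$ if there exists a
permutation $\sigma$ of $I$ (that is, $\sigma\colon I \to I$ a bijective transformation of $I$ onto $I$) such that

\[ \kappa'_i = \kappa_{\sigma(i)} \quad \text{for all } i \in I. \]


Problem 381 (The Generalized Commutative Laws). If $\langle \kappa'_i \mid i \in I \rangle$ is a
rearrangement of the cardinals $\langle \kappa_i \mid i \in I \rangle$, then

\[ \sum_{i \in I} \kappa'_i = \sum_{i \in I} \kappa_i, \quad \text{and} \quad \prod_{i \in I} \kappa'_i = \prod_{i \in I} \kappa_i. \]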


Problem 382. Formulate and prove “Generalized Associative Laws” for arbitrary 
sums and products of cardinal numbers. 


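Problem 383 (Monotonicity of Sum and Product). Prove that if $\alpha_i \leq \beta_i$ for all
$i \in I$, then

\[ \sum_{i \in I} \alpha_i \leq \sum_{i \in I} \beta_i, \quad \text{and} \quad \prod_{i \in I} \alpha_i \leq \prod_{i \in I} \beta_i. \]


Problem 384 (Laws of Exponents). Prove that if $\alpha$, $\beta$, and $\kappa$ are cardinals, then

\[ \kappa^{\alpha} \kappa^{\beta} = \kappa^{\alpha + \beta}, \quad \alpha^{\kappa} \beta^{\kappa} = (\alpha\beta)^{\kappa}, \quad \text{and} \quad (\kappa^{\alpha})^{\beta} = \kappa^{\alpha\beta}. \]


Problem 385 (Generalized Laws of Exponents).

\[ \prod_{i \in I} \kappa^{\alpha_i} = \kappa^{\sum_{i \in I} \alpha_i}, \quad \text{and} \quad \prod_{i \in I} \alpha_i^{\kappa} = \Bigl( \prod_{i \in I} \alpha_i \Bigr)^{\kappa}. \]


Problem 386. Are the following strict inequalities true?

\[ \alpha < \beta \Rightarrow \alpha^{\kappa} < \beta^{\kappa}, \quad \text{and} \quad \alpha < \beta \Rightarrow \kappa^{\alpha} < \kappa^{\beta}. \]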


Here are some examples of computation using cardinal arithmetic. 


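Example 387. Find $\sum_{n \in \mathbb{N}} n = 1 + 2 + 3 + \cdots$.

Proof (Solution). Put $\kappa = \sum_{n \in \mathbb{N}} n = 1 + 2 + 3 + \cdots$. Since $n \geq 1$ for all $n \in \mathbb{N}$, we
get

\[ \kappa = \sum_{n \in \mathbb{N}} n = 1 + 2 + 3 + \cdots \geq 1 + 1 + 1 + \cdots = 1 \cdot \aleph_0 = \aleph_0, \]

but also $n < \aleph_0$ for all $n \in \mathbb{N}$, and so

\[ \kappa = \sum_{n \in \mathbb{N}} n = 1 + 2 + 3 + \cdots \leq \aleph_0 + \aleph_0 + \aleph_0 + \cdots = \aleph_0 \cdot \aleph_0 = \aleph_0^2 = \aleph_0. \]

Combining the inequalities $\kappa \geq \aleph_0$ and $\kappa \leq \aleph_0$ we get $\kappa = \aleph_0$. □


Example 388. Find $\prod_{n \in \mathbb{N}} n = 1 \cdot 2 \cdot 3 \cdots$.

Proof (Solution). Put $\kappa = \prod_{n \in \mathbb{N}} n = 1 \cdot 2 \cdot 3 \cdots = 2 \cdot 3 \cdot 4 \cdots = \prod_{n \in \mathbb{N}} (n + 1)$. Since
$n + 1 \geq 2$ for all $n \in \mathbb{N}$, we get

\[ \kappa = \prod_{n \in \mathbb{N}} (n + 1) = 2 \cdot 3 \cdot 4 \cdots \geq 2 \cdot 2 \cdot 2 \cdots = 2^{\aleph_0}, \]

but also $n < \aleph_0 < 2^{\aleph_0}$ for all $n \in \mathbb{N}$, and so

\[ \kappa = \prod_{n \in \mathbb{N}} n = 1 \cdot 2 \cdot 3 \cdots \leq 2^{\aleph_0} \cdot 2^{\aleph_0} \cdot 2^{\aleph_0} \cdots = (2^{\aleph_0})^{\aleph_0} = 2^{\aleph_0^2} = 2^{\aleph_0}. \]

As $\kappa \geq 2^{\aleph_0}$ and $\kappa \leq 2^{\aleph_0}$ we get $\kappa = 2^{\aleph_0}$. □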


6.5 The Binary Tree 


Recall that a binary word (or a finite binary sequence) is a finite word made only of
0 and 1, and the set of all binary words is:

\[ \{0,1\}^* := \{\varepsilon, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, \dots\}. \]

The unique word of length zero is the empty word, denoted by $\varepsilon$, and for any $n$, there
are exactly $2^n$ binary words of length $n$.

We can arrange the binary words in a tree structure by regarding prefixes of words
as “ancestors” and extensions as “descendants.” This will be called the tree of binary
words, or simply the binary tree. The binary words are also called the nodes of the
tree. For each $n$, there are $2^n$ binary words of length $n$ forming the nodes of the tree at
level $n$, and each node $u$ of length $n$ has two immediate descendants (extensions) of
length $n + 1$, namely $u^\frown 0$ and $u^\frown 1$. Since the empty word $\varepsilon$ is a prefix of every word,
it will be an ancestor of every word, and hence will form the root node of the tree.

A picture of the first few levels of the binary tree is shown below.


                       ε
           0                       1
     00         01           10         11
  000  001   010  011     100  101   110  111


Problem 389. Find an effective bijection between $\mathbb{N}$ and the set of nodes in the
binary tree. Conclude that there are $\aleph_0$ nodes in the binary tree.
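One possible coding of nodes by numbers can be sketched as follows. It is offered as an illustration rather than as the intended solution, and it assumes that $\mathbb{N}$ starts at 1 and that binary words are Python strings (the empty word being `''`); the names `word_of` and `code_of` are hypothetical.

```python
def word_of(n):
    """Binary word coded by the positive integer n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # bin(n) looks like '0b1...'; drop the '0b' prefix and the leading 1.
    return bin(n)[3:]

def code_of(word):
    """Inverse map: the positive integer coding a binary word."""
    return int("1" + word, 2)

print([word_of(n) for n in range(1, 8)])
print(all(code_of(word_of(n)) == n for n in range(1, 1000)))
```

Under this coding the numbers $2^k, \dots, 2^{k+1} - 1$ code exactly the $2^k$ words of length $k$, so the tree is listed level by level.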


Starting from the root node $\varepsilon$, one can descend down the tree to obtain an
infinite branch through the binary tree, that is, an infinite set of nodes of the form
$\{u_0, u_1, u_2, \dots, u_n, \dots\}$, where $u_0 = \varepsilon$ and each $u_{n+1}$ is one of the two possible
immediate extensions of $u_n$, so that $u_{n+1}$ equals either $u_n{}^\frown 0$ or $u_n{}^\frown 1$, and we get
$\mathrm{len}(u_n) = n$ by induction.

There are infinitely many (in fact, uncountably many) infinite branches through
this tree, where each branch is represented uniquely by an infinite binary sequence:
For any given infinite binary sequence $x \in \{0,1\}^{\mathbb{N}}$, the set $\{x|n \mid n = 0, 1, 2, \dots\}$
of all prefixes of $x$ forms a branch through the tree. E.g., the infinite binary
sequence $000000\cdots$ represents the leftmost branch, $111111\cdots$ the rightmost
branch, and the sequence $010101\cdots$ represents a “left-right-left-right-$\cdots$” zigzag
branch: $\varepsilon, 0, 01, 010, 0101, \dots$.


Problem 390. An infinite branch $B$ through the binary tree is a set of binary words
which contains exactly one node of length $n$ for each $n = 0, 1, 2, \dots$ and is linearly
ordered by the “prefix” relation (i.e., for any two nodes in $B$, one is a prefix of the
other). Thus $B = \{u_0, u_1, u_2, \dots, u_n, \dots\}$ where each $u_n$ has length $n$ and $u_{n+1}$
extends $u_n$ by postfixing a single bit to it. Let $\mathcal{B}$ be the set of all branches through
the binary tree. Prove that $\mathcal{B} \sim \{0,1\}^{\mathbb{N}}$, via a very effective natural correspondence.


Problem 391. Prove that there are $2^{\aleph_0}$ branches through the binary tree.


Almost Disjoint Families of Subsets of N 


Definition 392. A family $\mathcal{D} \subseteq \mathcal{P}(\mathbb{N})$ of sets of natural numbers is said to be an
almost disjoint family if every set in $\mathcal{D}$ is infinite and the intersection of any two
distinct sets in $\mathcal{D}$ is finite.

While any pairwise disjoint family of subsets of $\mathbb{N}$ is effectively countable (why?),
there are almost disjoint families which are uncountable (of size $\geq \mathfrak{c}$).


Problem 393. Show that there is an almost disjoint family $\mathcal{D}$ of subsets of $\mathbb{N}$ with
$|\mathcal{D}| = 2^{\aleph_0} \geq \mathfrak{c}$.


[Hint: Replace $\mathbb{N}$ by $\{0,1\}^*$ and note that the intersection of two distinct infinite
branches (as defined in Problem 390) through the binary tree is finite.]
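The idea in the hint can be observed on finite portions of two branches. The sketch below rests on assumptions made only for illustration: words are coded by positive integers through the hypothetical helper `code_of`, and an infinite binary sequence is modelled as a Python function giving its $n$-th bit.

```python
def code_of(word):
    """Code a binary word by a positive integer (one possible coding, assumed here)."""
    return int("1" + word, 2)

def branch_codes(x, depth):
    """Codes of the nodes x|0, x|1, ..., x|depth on the branch of the sequence x.

    x is a function giving the n-th bit of the sequence (n = 1, 2, ...).
    """
    word = ""
    codes = {code_of(word)}
    for n in range(1, depth + 1):
        word += str(x(n))
        codes.add(code_of(word))
    return codes

def zeros(n):
    return 0                  # the sequence 000000...

def zigzag(n):
    return (n + 1) % 2        # the sequence 010101...

for depth in (5, 50, 500):
    common = branch_codes(zeros, depth) & branch_codes(zigzag, depth)
    print(depth, sorted(common))
```

However deep the two branches are followed, the common part stays the same finite set: the codes of the shared prefixes $\varepsilon$ and $0$.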


6.6 The Cantor Set K 


In this section we define the important Cantor set $K$, a set of reals which naturally
has cardinality $2^{\aleph_0}$. The Cantor set is constructed using a very special “binary tree
of intervals” called the Cantor system of intervals.

More precisely, we will map the nodes of the tree of binary words into a
collection of intervals, and assign a closed interval $I[u]$ to every binary word $u$
in such a way that:*

For every binary word $u$, the two intervals $I[u^\frown 0]$ and $I[u^\frown 1]$ will be
disjoint subintervals contained within $I[u]$.


The definition below proceeds by induction on node depth (= word length). 

Recall that any closed interval $I = [a, b]$ (with $a < b$) can be trisected into three
equal subintervals each of length $\ell = \frac{1}{3}(b - a)$, so that $a < a + \ell < a + 2\ell < b$. If
we remove the middle-third open interval $(a + \ell, a + 2\ell)$ from $I = [a, b]$, then $I$
splits into two closed subintervals: $[a, a + \ell]$ = the left-third of $I$, and $[a + 2\ell, b]$ =
the right-third of $I$.


Definition 394 (The Cantor System of Intervals). Let $I[\varepsilon] = [0, 1]$, and having
defined $I[u]$ for all words $u$ of length $n$, define $I[v]$ for words $v$ of length $n + 1$ by the
rule

\[ I[u^\frown 0] := \text{Left-third of } I[u], \quad \text{and} \quad I[u^\frown 1] := \text{Right-third of } I[u]. \]

Since each binary word $v$ of length $n + 1$ has one of the forms $v = u^\frown 0$ or $v = u^\frown 1$
where $u$ has length $n$, by induction this completely defines $I[u]$ for all binary
words $u$.


For example, we have $I[0]$ = left-third of $[0, 1]$ = $[0, \frac{1}{3}]$, and $I[1]$ = right-third of
$[0, 1]$ = $[\frac{2}{3}, 1]$. Also $I[00]$ = left-third of $I[0]$ = $[0, \frac{1}{9}]$, and $I[01]$ = right-third of
$I[0]$ = $[\frac{2}{9}, \frac{1}{3}]$. Similarly, $I[10] = [\frac{2}{3}, \frac{7}{9}]$, $I[11] = [\frac{8}{9}, 1]$, etc.
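The inductive rule translates directly into a short computation with exact rational arithmetic. This is a minimal sketch, assuming binary words are given as Python strings of the characters 0 and 1; the function name `cantor_interval` is introduced only for this illustration.

```python
from fractions import Fraction

def cantor_interval(word):
    """Endpoints (a, b) of I[word] for a binary word given as a string of 0s and 1s."""
    a, b = Fraction(0), Fraction(1)      # I[empty word] = [0, 1]
    for bit in word:
        third = (b - a) / 3
        if bit == "0":
            b = a + third                # keep the left third
        elif bit == "1":
            a = a + 2 * third            # keep the right third
        else:
            raise ValueError("word must contain only 0 and 1")
    return a, b

for w in ["", "0", "1", "00", "01", "10", "11"]:
    a, b = cantor_interval(w)
    print(repr(w), f"[{a}, {b}]")
```

The printed endpoints agree with the examples above, and each step divides the length by 3, so an interval for a word of length $n$ has length $1/3^n$.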


*In fact, we have already seen a form of this construction in the previous chapter as part of the proof
of Cantor’s theorem that $\mathbb{R}$ is uncountable. However, in that proof only a specific “infinite branch
of intervals” was “diagonalized out” to produce a real number distinct from the given sequence of
reals. Here we will deal with the full tree of intervals.

This gives the following “binary tree of intervals,” which we call the Cantor
System of Intervals, or the Cantor Tree of Intervals.


[Figure: the Cantor Tree of Intervals, with root $I[\varepsilon] = [0, 1]$.]

If $x \in \{0,1\}^{\mathbb{N}}$ is an infinite binary sequence, say

\[ x = x_1 x_2 x_3 \cdots x_n \cdots, \quad \text{each $x_n$ equals 0 or 1}, \]

then let $x|n := x_1 x_2 \dots x_n$ denote the prefix word of $x$ of length $n$. Thus $x$
represents the infinite branch through the binary tree given by its prefix words
$x|0 = \varepsilon$, $x|1 = x_1$, $x|2 = x_1 x_2$, $x|3 = x_1 x_2 x_3$, etc.

Now note that given any $x \in \{0,1\}^{\mathbb{N}}$, the infinite branch of its prefix words $x|0$,
$x|1$, $x|2$, ..., $x|n$, ..., etc., also determines an infinite branch through the above
Cantor Tree of Intervals, giving the corresponding nested sequence

\[ I[\varepsilon] \supseteq I[x|1] \supseteq I[x|2] \supseteq I[x|3] \supseteq \cdots \supseteq I[x|n] \supseteq \cdots \]

of closed intervals, where the $n$-th interval $I[x|n]$ has length $1/3^n$. Therefore, by
the Nested Interval Property, this nested sequence of intervals must contain a unique
real number, which we denote by $F(x)$.

So each $x \in \{0,1\}^{\mathbb{N}}$, via the nested branch $I[x|0] \supseteq I[x|1] \supseteq I[x|2] \supseteq \cdots$
through the Cantor Tree of Intervals, determines the unique point $F(x)$ in their
intersection. We thus have a function $F\colon \{0,1\}^{\mathbb{N}} \to [0,1]$ which maps the set of
infinite binary sequences into the interval $[0, 1]$. Officially, $F$ is defined by setting

\[ F(x) := \text{the unique member of } \bigcap_{n=1}^{\infty} I[x|n]. \]
Theorem 395. Let $I[u]$, where $u$ ranges over all binary words, be the Cantor
System of Intervals, so that $I[\varepsilon] = [0, 1]$, $I[0] = [0, \frac{1}{3}]$, $I[1] = [\frac{2}{3}, 1]$, etc.
The Cantor System of Intervals then naturally determines a unique injective function
$F\colon \{0,1\}^{\mathbb{N}} \to [0,1]$ such that for every $x \in \{0,1\}^{\mathbb{N}}$,

\[ F(x) = \text{the unique member of } \bigcap_{n=1}^{\infty} I[x|n]. \]


Problem 396. Prove Theorem 395 by showing that $F$ is one-to-one.


[Hint: Distinct branches through the Cantor Tree of Intervals determine distinct
nested sequences of intervals which eventually become disjoint.]


Problem 397. For each of the following infinite binary sequences $x$, find the first
four of the nested intervals determined by $x$, and then compute $F(x)$.

1. $x = 000000\cdots$
2. $x = 111111\cdots$
3. $x = 010101\cdots$
4. $x = 101010\cdots$


[Hint: $F(x)$ is a limit of the endpoints of the nested intervals $I[x|n]$, $n \in \mathbb{N}$.]


Definition 398 (The Cantor Set $K$). The Cantor Set $K$ is defined as the subset of
$[0, 1]$ which equals the range of the function $F$ of Theorem 395, i.e., $K := \mathrm{ran}(F) =
\{F(x) \mid x \in \{0,1\}^{\mathbb{N}}\}$.

The bijection $F\colon \{0,1\}^{\mathbb{N}} \to K$ establishes a natural identification of infinite binary
sequences with the points of the Cantor set $K$.


Corollary 399. $K \sim \{0,1\}^{\mathbb{N}}$, so $|K| = 2^{\aleph_0}$: the Cantor Set contains exactly $2^{\aleph_0}$
elements.


Corollary 400. $2^{\aleph_0} \leq \mathfrak{c}$.


Since we have earlier established that $\mathfrak{c} \leq 2^{\aleph_0}$, an application of the Cantor-Bernstein
Theorem yields the following important result.


Corollary 401 (Cantor). $2^{\aleph_0} = \mathfrak{c}$.

In the next section we will indicate how to build an explicit effective bijection
between $\mathbb{R}$ and $\{0,1\}^{\mathbb{N}}$.

Let $K_n$ be the set formed by taking the union of the $2^n$ intervals at level (depth)
$n$ of the Cantor Tree of Intervals. So $K_0 = [0, 1]$, $K_1 = [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$, $K_2 =
[0, \frac{1}{9}] \cup [\frac{2}{9}, \frac{1}{3}] \cup [\frac{2}{3}, \frac{7}{9}] \cup [\frac{8}{9}, 1]$, etc., and in general $K_n$ is the union of the $2^n$ intervals
$I[u]$ where $u$ ranges over the $2^n$ binary words of length $n$:

\[ K_n := \bigcup \{I[u] \mid u \text{ is a binary word of length } n\}. \]

Note also that the intervals of $K_{n+1}$ are obtained by removing the middle-third open
intervals from each of the intervals of $K_n$, thus doubling the number of intervals as
we go from $K_n$ to $K_{n+1}$. The first few of the sets $K_n$ are shown below.


[Figure: The sets $K_n$ for $n = 0, 1, 2$, drawn as unions of the intervals $I[\varepsilon]$; $I[0]$, $I[1]$; and $I[00]$, $I[01]$, $I[10]$, $I[11]$.]


The following problem is instructive.

Problem 402 (Alternative Characterization of the Cantor Set). Show that

\[ K = \bigcap_{n=1}^{\infty} K_n. \]


The Cantor set is uncountable (with cardinality $\mathfrak{c} = 2^{\aleph_0} > \aleph_0$), but note that each set
$K_n$ consists of $2^n$ disjoint closed intervals each of length $1/3^n$, so the total length or
measure of the set $K_n$ is $(2/3)^n$. Since $K \subseteq K_n$ for all $n$, and since $\lim_{n \to \infty} (2/3)^n = 0$,
this seems to indicate that the Cantor Set $K$ has zero measure, a notion that will be
officially defined in Chap. 15.
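The count of intervals and the total length at each stage can be tabulated with exact fractions. The following sketch assumes intervals are stored as pairs of endpoints; `next_stage` is a hypothetical helper that passes from $K_n$ to $K_{n+1}$.

```python
from fractions import Fraction

def next_stage(intervals):
    """Remove the open middle third from each closed interval (a, b)."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))      # left third
        result.append((b - third, b))      # right third
    return result

stage = [(Fraction(0), Fraction(1))]       # K_0 = [0, 1]
for n in range(6):
    total = sum(b - a for a, b in stage)
    print(n, len(stage), total)            # n, number of intervals, total length
    stage = next_stage(stage)
```

Each row shows $2^n$ intervals with total length $(2/3)^n$, which shrinks toward 0 while the number of intervals keeps doubling.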


Problem 403. Prove that $[0, 1] \smallsetminus K$, the complement of the Cantor Set within $[0, 1]$,
can be partitioned into a pairwise disjoint countable sequence of open intervals
whose lengths add up to 1.


Thus one can also view the Cantor set as being constructed by removing middle-third
open intervals in stages, where at each stage we have a finite disjoint union of closed
intervals, and proceed to the next stage by removing the middle-thirds of all the
closed intervals of the current stage, which splits every interval into two and in
effect doubles the number of intervals. We start at stage 0 with the unit interval
$K_0 = [0, 1]$, and having the set $K_n$ at stage $n$, remove the middle-thirds of each
of the $2^n$ disjoint closed intervals of $K_n$ to obtain the set $K_{n+1}$ of stage $n + 1$
consisting of $2^{n+1}$ disjoint closed intervals. Once all the middle-third open intervals
are removed, what remains is the Cantor set.

While this “top-down” definition of the Cantor set is often useful, we re-emphasize
the original “bottom-up” construction via the Cantor Tree of Intervals
where the points of the Cantor set are represented uniquely by infinite binary
sequences: Since each sequence of nested intervals $\langle I_n \rangle$ given by an infinite branch
through the binary tree produces a unique real in its intersection $\bigcap_n I_n$, we get a
natural effective one-to-one correspondence between the infinite binary sequences
in $\{0,1\}^{\mathbb{N}}$ and the points of the Cantor set $K$.


Problem 404. For any infinite binary sequence $x = x_1 x_2 x_3 \dots x_n \dots$,

\[ F(x) = \sum_{n=1}^{\infty} \frac{2 x_n}{3^n} \quad \text{(where $F$ is as in Theorem 395).} \]
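A finite version of this formula can be checked numerically: the partial sum over the first $n$ bits should be the left endpoint of $I[x|n]$, so that $F(x)$ lies within $1/3^n$ of it. The sketch below assumes words are Python strings of 0s and 1s; the helper names are hypothetical, and the check is an illustration, not a proof.

```python
from fractions import Fraction

def left_endpoint(word):
    """Left endpoint of I[word], computed by repeated trisection."""
    a, length = Fraction(0), Fraction(1)
    for bit in word:
        length /= 3              # every subinterval is a third as long
        if bit == "1":
            a += 2 * length      # right third: skip the left and middle thirds
    return a

def partial_sum(word):
    """Sum of 2*x_n / 3**n over the bits x_1 ... x_n of the word."""
    return sum(Fraction(2 * int(bit), 3 ** n) for n, bit in enumerate(word, start=1))

for w in ["0", "01", "0110", "011010011"]:
    print(w, left_endpoint(w) == partial_sum(w))
```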


Problem 405. Does $1/4 \in K$? Does $1/e \in K$?


Problem 406. Recall the Cantor Machine $M\colon \mathbb{R}^{\mathbb{N}} \to [0,1]$ used in the proof of
Cantor’s theorem, which satisfies $M(\langle a_n \rangle) \in [0,1] \smallsetminus \{a_n \mid n \in \mathbb{N}\}$ for all sequences
$\langle a_n \rangle \in \mathbb{R}^{\mathbb{N}}$. Show that


1. The mapping $M$ is highly non-injective in the sense that for any $\langle a_n \rangle \in \mathbb{R}^{\mathbb{N}}$ the
set $\{\langle x_n \rangle \in \mathbb{R}^{\mathbb{N}} \mid M(\langle x_n \rangle) = M(\langle a_n \rangle)\}$ has cardinality $\mathfrak{c} = 2^{\aleph_0}$.

2. The set of all possible responses produced by the Cantor Machine $M$ equals the
Cantor set, that is, $\mathrm{ran}(M) = K$.

Problem 407. Exhibit a natural effective bijection between the Cartesian product
$\{0,1\}^{\mathbb{N}} \times \{0,1\}^{\mathbb{N}}$ and $\{0,1\}^{\mathbb{N}}$.


[Hint: Intertwine two sequences $x_1 x_2 \cdots$ and $y_1 y_2 \cdots$ into one: $x_1 y_1 x_2 y_2 \cdots$.]
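The intertwining in the hint can be sketched with lazy iterators, which stand in here for infinite sequences. The function names `intertwine` and `untwine` are assumptions made for this illustration, and only finite prefixes are ever inspected.

```python
from itertools import cycle, islice, repeat

def intertwine(x, y):
    """Yield x1, y1, x2, y2, ... from two (possibly infinite) bit iterables."""
    for a, b in zip(x, y):
        yield a
        yield b

def untwine(z, n):
    """Recover the first n bits of each original sequence from a merged one."""
    prefix = list(islice(z, 2 * n))
    return prefix[0::2], prefix[1::2]

x = cycle([0, 1])      # the sequence 0 1 0 1 ...
y = repeat(1)          # the sequence 1 1 1 1 ...
print(list(islice(intertwine(x, y), 8)))
print(untwine(intertwine(cycle([0, 1]), repeat(1)), 4))
```

Since the odd-numbered places of the merged sequence carry $x$ and the even-numbered places carry $y$, both sequences can be read back, which is what makes the map one-to-one and onto.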


Problem 408. Let K x K be the “planar Cantor set.” Show that there is a natural 
effective bijection between K x K and K. 


6.7 The Identity $2^{\aleph_0} = \mathfrak{c}$


The identity $2^{\aleph_0} = \mathfrak{c}$ can be used to obtain the following results.

Problem 409. Let $\mathbb{C} = \mathbb{R}^2$ be the complex plane. Prove that $\mathfrak{c}^2 = \mathfrak{c}$ and so $\mathbb{C} \sim \mathbb{R}$.

By induction, this can be generalized to any number of dimensions.

Problem 410. $\mathfrak{c}^n = \mathfrak{c}$, and so $\mathbb{R}^n \sim \mathbb{R}$, for any $n \in \mathbb{N}$.


Problem 411. Prove that any subset of a Euclidean space (= $\mathbb{R}^n$ for some $n$)
containing a line segment (and in particular any subset with nonempty interior)
must be equinumerous with the entire space.


Cantor’s proofs of these facts initially resulted in controversy, since they seemed to 
contradict the familiar principle of “invariance of dimensions” which says that there 
cannot be a continuous one-to-one correspondence between two Euclidean spaces 
of different dimensions. For example, Cantor’s result $\mathbb{R}^2 \sim \mathbb{R}$ implies that one can
represent the points of the Cartesian plane using a single real coordinate in a one- 
to-one fashion (as opposed to the usual form which uses a pair of real coordinates)! 
However, it soon became clear that the confusion resulted from a failure to recognize 
the requirement of continuity in invariance of dimensions. As the field of topology 
developed, it was firmly established that the bijections obtained by Cantor’s method, 
while being effective, cannot be continuous, and so the principle of invariance of 
dimensions remains intact. 


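Definition 412. The mapping $h\colon \{0,1\}^{\mathbb{N}} \to [0,1]$ is defined by setting, for each
$x = \langle x_n \mid n \in \mathbb{N} \rangle \in \{0,1\}^{\mathbb{N}}$,

\[ h(x) := \sum_{n=1}^{\infty} \frac{x_n}{2^n}, \]

so that $h(x)$ is the real number in $[0,1]$ having an infinite binary representation
$0.x_1 x_2 x_3 \cdots x_n \cdots$.


Problem 413. The map $h\colon \{0,1\}^{\mathbb{N}} \to [0,1]$ is surjective but not injective. For which
$x \in \{0,1\}^{\mathbb{N}}$ can you find $y \in \{0,1\}^{\mathbb{N}}$, $y \neq x$, with $h(y) = h(x)$?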




Definition 414. Let $\{0,1\}^{\mathbb{N}}_{\infty}$ be the set of infinite binary sequences which are not
eventually zero, that is, those which have infinitely many entries of 1:

\[ \{0,1\}^{\mathbb{N}}_{\infty} := \{x \in \{0,1\}^{\mathbb{N}} \mid x(n) = 1 \text{ for infinitely many } n\}. \]

Problem 415. Show that the restriction of $h$ to $\{0,1\}^{\mathbb{N}}_{\infty}$ is an injective function with
range $(0, 1]$. Hence there is an effective bijection from $\{0,1\}^{\mathbb{N}}_{\infty}$ onto $(0, 1]$.

Problem 416. Show that there is an effective bijection from $\{0,1\}^{\mathbb{N}}$ onto $\{0,1\}^{\mathbb{N}}_{\infty}$.


[Hint: Given $x = \langle x_n \rangle \in \{0,1\}^{\mathbb{N}}$, put $h(x) := \langle 1, 1 - x_1, 1 - x_2, \dots \rangle$ if $x$ is
eventually zero, put $h(x) := \langle 0, x_1, x_2, \dots \rangle$ if $x$ is eventually one, and $h(x) := x$
otherwise.]


Combining the results of the last two problems, we get the following. 


Corollary 417. There is an explicit effective bijection from the set $\{0,1\}^{\mathbb{N}}$ of all
infinite binary sequences onto the interval $(0, 1]$.


One could also express this explicit effective bijection in the following alternative 
form. 


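Problem 418. Define $\Phi\colon \{0,1\}^{\mathbb{N}} \to (0, 1]$ by

\[ \Phi(x) := \begin{cases} 1 - \frac{1}{2} h(x) & \text{if $x$ is eventually zero,} \\ \frac{1}{2} h(x) & \text{if $x$ is eventually one,} \\ h(x) & \text{otherwise,} \end{cases} \]

where $h(x) = \sum_{n=1}^{\infty} x_n / 2^n$ is the map of Definition 412.
Then $\Phi$ is an effective bijection from $\{0,1\}^{\mathbb{N}}$ onto the interval $(0, 1]$.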


In the last chapter we had seen effective bijections between (0, 1] and (0, 1), and 
between (0, 1) and R. Therefore we have: 


Corollary 419. There is an explicit effective bijection from the set $\{0,1\}^{\mathbb{N}}$ onto $\mathbb{R}$.
Hence we get $2^{\aleph_0} = \mathfrak{c}$ in an especially direct and effective way.


Problem 420. Find a natural effective bijection between the set $\mathbb{N}^{\mathbb{N}}$ of all infinite
sequences of natural numbers and the set $\{0,1\}^{\mathbb{N}}_{\infty}$ of all infinite binary sequences
which are not eventually zero.


[Hint: Given an infinite sequence $\langle n_1, n_2, \dots, n_k, \dots \rangle$ of natural numbers, consider
the infinite binary sequence which has a 1 at position $n_1$, position $n_1 + n_2$, position
$n_1 + n_2 + n_3$, and so on, and has a 0 at every other place.]
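On finite prefixes the coding described in the hint looks as follows. This is a sketch under the assumptions that natural numbers start at 1 and that positions in a binary sequence are counted from 1; the function names are hypothetical.

```python
from itertools import accumulate

def bits_from_naturals(ns):
    """Prefix of the binary sequence with 1s exactly at n1, n1+n2, n1+n2+n3, ..."""
    if not ns or any(n < 1 for n in ns):
        raise ValueError("expected a nonempty list of positive integers")
    positions = set(accumulate(ns))          # running sums n1, n1+n2, ...
    last = max(positions)
    return [1 if p in positions else 0 for p in range(1, last + 1)]

def naturals_from_bits(bits):
    """Inverse on prefixes: the gaps between successive 1s."""
    ones = [p for p, b in enumerate(bits, start=1) if b == 1]
    return [q - p for p, q in zip([0] + ones, ones)]

ns = [3, 1, 2, 1]
bits = bits_from_naturals(ns)
print(bits)
print(naturals_from_bits(bits) == ns)
```

Because every $n_k \geq 1$, the positions of the 1s are strictly increasing, so an infinite sequence of naturals always yields infinitely many 1s, and the gaps recover the original sequence.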


Problem 421. Prove that $\aleph_0^{\aleph_0} = \mathfrak{c}$ effectively, by exhibiting an explicit bijection $H$
from $\mathbb{N}^{\mathbb{N}}$ onto $(0, 1]$.


[Hint: Consider the mapping $H\colon \mathbb{N}^{\mathbb{N}} \to (0, 1]$

\[ H(\langle n_1, n_2, n_3, \dots, n_k, \dots \rangle) := \frac{1}{2^{n_1}} + \frac{1}{2^{n_1 + n_2}} + \frac{1}{2^{n_1 + n_2 + n_3}} + \cdots, \]

which maps $\mathbb{N}^{\mathbb{N}}$ bijectively onto $(0, 1]$.]

Problem 422. Show effectively that $(\{0,1\}^{\mathbb{N}})^{\mathbb{N}} \sim \{0,1\}^{\mathbb{N}}$.


[Hint: Use the method of Proposition 316 in which a sequence of sequences is
effectively combined into a single sequence.]


Corollary 423. We have $\mathfrak{c}^{\aleph_0} = \mathfrak{c}$ and $\mathbb{R}^{\mathbb{N}} \sim \mathbb{R}$ effectively. In particular, the set of
all real sequences is effectively equinumerous with the real line.


The results established so far can be summarized as follows. 


Theorem 424. One can explicitly construct effective bijections between any two of 
the following sets: 


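$\{0,1\}^{\mathbb{N}}$, $\mathbb{N}^{\mathbb{N}}$, $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^n$, $\mathbb{R}^{\mathbb{N}}$, $(0,1)$, $(0,1]$, $[0,1)$, $[0,1]$, $[0,1]^{\mathbb{N}}$.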


6.8 Cantor’s Theorem: The Diagonal Method 


In this section we will generalize the inequality $\aleph_0 < 2^{\aleph_0}$ to arbitrary cardinals,
an important result known as Cantor’s Theorem. An even more general (but non-effective)
result called König’s Inequality will also be proved. We start with a proof
of $\aleph_0 < 2^{\aleph_0}$ in terms of binary sequences which readily generalizes to arbitrary
cardinals.


Problem 425 (Cantor Diagonalization of Binary Sequences). $\{0,1\}^{\mathbb{N}} \not\preccurlyeq \mathbb{N}$.
More specifically, given any sequence of infinite binary sequences, one can effectively
find another infinite binary sequence different from all the given ones.


[Hint: Write out the given sequence of infinite binary sequences as an infinite array
(matrix) of bits whose first row is the first given sequence, the second row is the
second given sequence, etc. Now let $d_1$ be the complement of the first bit of the
first row, $d_2$ be the complement of the second bit of the second row, etc. Notice that
$d_1 d_2 \cdots d_n \cdots$ is the sequence obtained by taking the diagonal of the given array
and then inverting every one of its bits.]
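The construction in the hint can be carried out on a concrete enumeration. In the sketch below an infinite binary sequence is modelled as a function from positions $1, 2, 3, \dots$ to bits, and `rows` is a hypothetical sample enumeration chosen only for illustration (its $m$-th row lists the binary digits of $m$, least significant first, followed by zeros).

```python
def rows(m):
    """Hypothetical sample enumeration: the m-th given sequence."""
    def sequence(n):
        return (m >> (n - 1)) & 1      # n-th binary digit of m, then zeros
    return sequence

def anti_diagonal(enumeration):
    """Return the sequence d with d(n) = complement of the n-th bit of row n."""
    def d(n):
        return 1 - enumeration(n)(n)
    return d

d = anti_diagonal(rows)
print([d(n) for n in range(1, 9)])
# d differs from the m-th row at position m, for every m tested.
print(all(d(m) != rows(m)(m) for m in range(1, 1000)))
```

Whatever enumeration is supplied in place of `rows`, the same two-line definition of `d` produces a sequence that disagrees with the $m$-th row in its $m$-th bit.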


The method outlined above, known as Cantor diagonalization, thus gives us a 
direct proof that there are uncountably many infinite binary sequences. 


Corollary 426. $\mathbb{N} \prec \{0,1\}^{\mathbb{N}}$, and so $\aleph_0 < 2^{\aleph_0}$.

Since $\{0,1\}^{\mathbb{N}} \sim \mathcal{P}(\mathbb{N})$, we also get:

Corollary 427. $\mathbb{N} \prec \mathcal{P}(\mathbb{N})$, i.e., $\aleph_0 < |\mathcal{P}(\mathbb{N})|$. So $\mathcal{P}(\mathbb{N})$ is uncountable, i.e., $\mathbb{N}$ has
an uncountable number of subsets.

Cantor diagonalization can be readily generalized to arbitrary cardinalities:


Theorem 428 (Cantor’s Theorem). $\kappa < 2^{\kappa}$ for any cardinal $\kappa$, i.e., the cardinal
$2^{\kappa}$ is strictly greater than $\kappa$. It follows that $\mathcal{P}(A) \not\preccurlyeq A$ and so $A \prec \mathcal{P}(A)$, for any set
$A$, i.e., $|A| < |\mathcal{P}(A)|$ and so the number of subsets of $A$ is strictly greater than $|A|$.


Problem 429. Prove Cantor’s Theorem. 
[Hint: Since $\mathcal{P}(A) \sim \{0,1\}^A$ one can work with $\{0,1\}^A$ instead of $\mathcal{P}(A)$.]


Problem 430. Give a direct effective proof of Cantor’s Theorem by showing that if
$F\colon A \to \mathcal{P}(A)$ then the “anti-diagonal set”

\[ D_F := \{x \in A \mid x \notin F(x)\} \]

is not in the range of $F$, i.e., $D_F \in \mathcal{P}(A) \smallsetminus \mathrm{ran}(F)$, and so $F$ is not onto.
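For a small finite set the claim can be verified exhaustively. This sketch assumes $F$ is represented as a dict from elements of $A$ to frozensets; `anti_diagonal_set` is a hypothetical helper name, and the check is an illustration, not a substitute for the proof.

```python
from itertools import combinations, product

def anti_diagonal_set(F, A):
    """D_F = {x in A : x not in F(x)} for F given as a dict from A to subsets of A."""
    return frozenset(x for x in A if x not in F[x])

A = [0, 1, 2]
subsets = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

# Exhaustively check every function F: A -> P(A); there are 8**3 of them.
checked = 0
for values in product(subsets, repeat=len(A)):
    F = dict(zip(A, values))
    assert anti_diagonal_set(F, A) not in F.values()
    checked += 1
print(checked)
```

The loop runs through all 512 functions, and in each case the anti-diagonal set is missed by $F$.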
The following beautiful result of König generalizes Cantor’s Theorem.


Theorem 431 (König’s Inequality (AC)). Let $K$ be any set and let $\alpha_k$ and $\beta_k$ be
cardinals for each $k \in K$.

\[ \text{If } \alpha_k < \beta_k \text{ for all } k \in K, \text{ then } \sum_{k \in K} \alpha_k < \prod_{k \in K} \beta_k. \]


Proof. Assume $\alpha_k < \beta_k$ for each $k \in K$, and let $\alpha := \sum_{k \in K} \alpha_k$ and $\beta :=
\prod_{k \in K} \beta_k$. We will be using the Axiom of Choice several times in this proof to
choose certain sets and elements. For each $k \in K$, fix a set $B_k$ such that $|B_k| = \beta_k$.
By replacing each $B_k$ with $\{k\} \times B_k$ if necessary, we can assume that the sets $B_k$,
$k \in K$, are pairwise disjoint. Also since $\alpha_k < \beta_k$, we can fix, for each $k \in K$, a
subset $A_k \subseteq B_k$ with $|A_k| = \alpha_k$ and an element $b_k \in B_k \smallsetminus A_k$. Put $A := \bigcup_{k \in K} A_k$ and
$B := \prod_{k \in K} B_k$, so that $|A| = \alpha$ and $|B| = \beta$.

Define $F\colon A \to B$ by setting, for each $x \in A$, $F(x) = y$, where $y =
\langle y_k \mid k \in K \rangle$ is defined as:

\[ y_k := \begin{cases} x & \text{if } x \in A_k, \\ b_k & \text{otherwise.} \end{cases} \]

Then $F$ is a one-to-one function, for if $x \neq x'$ are in $A$, $y = F(x)$ and $y' = F(x')$,
then either there is $k$ such that $x, x' \in A_k$, so that $y_k = x \neq x' = y'_k$, or else there
are distinct $k, k' \in K$ with $x \in A_k$ and $x' \in A_{k'}$, giving $y_k = x \neq b_k = y'_k$, hence
$y \neq y'$ in both cases. It follows that $F$ is injective, and so $\alpha \leq \beta$.

Finally, let $G\colon A \to B$ be an arbitrary function from $A$ to $B$. We show that $G$
cannot be surjective. For each $k \in K$, the set $D_k := \{\pi_k(G(x)) \mid x \in A_k\}$ is a subset
of $B_k$ of cardinality at most $\alpha_k$ (where $\pi_k\colon B \to B_k$ is the $k$-th coordinate projection
function), and so we can fix $d_k \in B_k \smallsetminus D_k$. Then the element $d := \langle d_k \mid k \in K \rangle \in B$
is not in the range of $G$, for if $x \in A$, then $x \in A_k$ for some $k \in K$, so $\pi_k(G(x)) \in
D_k$ while $\pi_k(d) = d_k \notin D_k$, hence $G(x) \neq d$. Thus $G$ is not surjective. Hence
$\alpha < \beta$. □
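The second half of the proof is a diagonal construction, and for finite families it can be run directly. The sketch below is illustrative only: the families are dicts of finite sets with $|A_k| < |B_k|$, elements of the product are dicts indexed by $k$, and the names `konig_witness` and `G` are hypothetical (the sample `G` is an arbitrary choice).

```python
def konig_witness(A, B, G):
    """Return d in the product of the B_k that G misses on the union of the A_k.

    A, B: dicts k -> set, with A[k] a subset of B[k] and |A[k]| < |B[k]|.
    G: function sending x to a dict k -> element of B[k].
    """
    d = {}
    for k in B:
        D_k = {G(x)[k] for x in A[k]}     # k-th coordinates reached from A_k
        d[k] = min(B[k] - D_k)            # nonempty, since |D_k| <= |A_k| < |B_k|
    return d

# The B_k are made pairwise disjoint by tagging elements with k.
A = {"p": {("p", 0)}, "q": {("q", 0), ("q", 1)}}
B = {"p": {("p", 0), ("p", 1)}, "q": {("q", 0), ("q", 1), ("q", 2)}}

def G(x):
    # Hypothetical sample map from the union of the A_k into the product.
    return {"p": ("p", 0), "q": x if x[0] == "q" else ("q", 2)}

d = konig_witness(A, B, G)
print(d)
print(all(G(x) != d for k in A for x in A[k]))
```

Here the sum of the $\alpha_k$ is $1 + 2 = 3$ and the product of the $\beta_k$ is $2 \cdot 3 = 6$; in the finite case `min` plays the role of the choice of $d_k$, which in general requires the Axiom of Choice.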


Problem 432. Derive Cantor’s Theorem as a direct corollary of König’s Inequality.


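6.9 The Cardinal $\mathfrak{f} = 2^{\mathfrak{c}}$ and Beyond


By Cantor’s Theorem, there is no largest cardinal number. For any cardinal $\kappa$, the
cardinal $2^{\kappa}$ is still bigger.


Definition 433. $\mathfrak{f} := 2^{\mathfrak{c}}$.


Thus we have: $0 < 1 < 2 < \cdots < \aleph_0 < \mathfrak{c} < \mathfrak{f}$.


Problem 434. Prove that $\mathfrak{c}^{\mathfrak{c}} = \mathfrak{f}$. Prove also that $\mathfrak{f}^{\aleph_0} = \mathfrak{f} = \mathfrak{f}^{\mathfrak{c}}$.

Thus it follows that the set of all functions from $\mathbb{R}$ to $\mathbb{R}$ has cardinality $\mathfrak{f}$.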
Problem 435. Prove that 


1. The collection of all bounded real-valued Riemann integrable functions defined
on the unit interval $[0, 1]$ has cardinality $\mathfrak{f}$.

2. On the other hand, the set of all continuous real-valued functions with
domain $\mathbb{R}$ has cardinality $\mathfrak{c}$.


[Hint: For the first result, use the fact that any bounded function defined on the unit
interval which is constant on the complement of the Cantor set must be Riemann
integrable. For the second result, note that two continuous functions which agree on
all rational points must agree on all real numbers.]


Cantor’s theorem enables us to obtain larger and larger infinite cardinals in an
endless fashion. Starting from $\aleph_0$ and repeatedly applying Cantor’s theorem we get:

\[ \aleph_0 < 2^{\aleph_0} < 2^{2^{\aleph_0}} < 2^{2^{2^{\aleph_0}}} < \cdots. \]

We can then get a cardinal $\beta$ larger than all the cardinals above by taking $\beta$ to be
the sum of all these cardinals:

\[ \beta := \aleph_0 + 2^{\aleph_0} + 2^{2^{\aleph_0}} + \cdots. \]

Then we can start again from $\beta$ and keep applying Cantor’s Theorem to get

\[ \beta < 2^{\beta} < 2^{2^{\beta}} < \cdots, \]

and so on. Iterating the process endlessly into the transfinite needs the notion of
ordinals, which will be defined later.

We have already seen that

\[ \mathfrak{c}^{\aleph_0} = \mathfrak{c} \quad \text{and} \quad \mathfrak{f}^{\aleph_0} = \mathfrak{f}. \]

More generally, we have $\kappa^{\aleph_0} = \kappa$ if $\kappa$ is any of $\mathfrak{c}$, $2^{\mathfrak{c}}$, $2^{2^{\mathfrak{c}}}$, etc.


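Problem 436. Put $\kappa_1 := 2^{\aleph_0}$ and $\kappa_{n+1} := 2^{\kappa_n}$ for $n = 1, 2, \dots$. Show that for any
$n \geq 1$, $\kappa_n^{\aleph_0} = \kappa_n$.


In view of these last facts, one may think that the cardinals such as $\mathfrak{c} = 2^{\aleph_0}$, $\mathfrak{f} = 2^{\mathfrak{c}}$,
and $2^{\mathfrak{f}}$ are “too large” to be increased by raising to the power $\aleph_0$, and one may
conjecture that if $\kappa$ is a sufficiently large cardinal, say if $\kappa \geq 2^{\aleph_0}$, then $\kappa^{\aleph_0} = \kappa$.
But this is false, since the cardinal $\beta$ mentioned above is larger than all of $\mathfrak{c}$, $2^{\mathfrak{c}}$, $2^{2^{\mathfrak{c}}}$,
etc., yet we have $\beta^{\aleph_0} > \beta$.


Problem 437. Prove that $\beta^{\aleph_0} > \beta$.

[Hint: Use König’s inequality.]


Problem 438. Prove that for any cardinal $\kappa$ there are cardinals $\mu > \kappa$ and $\nu > \kappa$
such that $\mu^{\aleph_0} = \mu$ and $\nu^{\aleph_0} > \nu$.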


The phenomenon in the last problem can be illustrated in a more general way using
the concept of cofinality. We will define cofinality in a later chapter where the
“cofinality version” of König’s result will be proved.


6.10 Additional Problems 


Problem 439. Show that the set of all monotone real functions has cardinality $\mathfrak{c}$.
(A function $f\colon \mathbb{R} \to \mathbb{R}$ is monotone if either $f(x) \leq f(y)$ for all $x < y$ or $f(x) \geq
f(y)$ for all $x < y$.)


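Problem 440. Given a set $A \subseteq \mathbb{N}$, define a real number $x_A$ as

\[ x_A := \sum_{k=1}^{\infty} \frac{\chi_A(k)}{k!}, \]

where we write $\chi_A(n) := 1$ if $n \in A$ and $\chi_A(n) := 0$ if $n \notin A$ (thus $\chi_A$ is the
characteristic function of $A$). Prove that


1. $A \cap B = \emptyset \Rightarrow x_{A \cup B} = x_A + x_B$, and
2. $x_A$ is irrational if and only if $A$ is infinite.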


Problem 441. Find a specific mapping

\[ F\colon \mathbb{R} \to \mathcal{P}(\mathbb{N}) \]

such that for all $x, y$, if $x < y$ then $F(x) \subsetneq F(y)$ with $F(y) \smallsetminus F(x)$ infinite.

[Hint: It may be easier to first obtain such a function $F$ from $\mathbb{R}$ to $\mathcal{P}(\mathbb{Q})$.]


Problem 442. Find a specific function $g\colon \mathbb{R} \to \mathbb{R}$ such that $x \neq y \Rightarrow g(x) -
g(y) \in \mathbb{R} \smallsetminus \mathbb{Q}$. (This gives an effective one-to-one map from $\mathbb{R}$ into the partition
$\mathbb{R}/\mathbb{Q}$.)


Problem 443. A nonempty subset $S$ of the set $\mathbb{Q}$ of rational numbers is called a
subring of $\mathbb{Q}$ if it is closed under addition, subtraction, and multiplication, that is,
$x, y \in S \Rightarrow x + y,\ x - y,\ xy \in S$. How many subsets of $\mathbb{Q}$ are subrings? Describe
the subrings of $\mathbb{Q}$ completely, and exhibit an effective one-to-one correspondence
between the collection of all subrings of $\mathbb{Q}$ and a familiar set.


Chapter 7 
Orders and Order Types 


Abstract This chapter introduces order isomorphisms and order types, as well as 
the basic operations of sums and product of order types. 


7.1 Orders, Terminology, and Notation 


Consider the natural numbers arranged in ascending order of magnitude:

\[ 1, 2, 3, \dots, n, n + 1, \dots \]

This arrangement is determined by the binary relation $<$ (less than), and a smaller
number is always placed to the left of any larger number: If $m$ and $n$ are two distinct
numbers, then $m$ precedes $n$ in the above arrangement if and only if $m < n$. Here,
the precedence relation $<$ is a transitive relation on $\mathbb{N}$ with the additional trichotomy
property that given any two distinct natural numbers exactly one of them precedes
the other.

Another familiar linear ordering is on the “points of the real number line”—the 
set of real numbers ordered by magnitude—where the precedence relation is again 
a transitive relation with the trichotomy property. 

Intuitively, by a linear ordering we mean a set whose members are viewed as 
points arranged in a single line by a transitive relation of precedence where for any 
two distinct points exactly one precedes the other. If P denotes this relation on (say) 
the set X of points, “x Py” stands for “x precedes y.” 


We will first review some basic definitions from Sect. 1.9. 


Definition 444. We say that P is an order on A, or that P orders A, or equivalently, 
that (A, P) is an order (or a total or linear order) if P is a transitive relation on the 
set A which satisfies trichotomy on A, i.e., for all x, y € A exactly one of the 
conditions 
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xPy, x=y, yPx 


holds. If A has more than one element, the order is called nontrivial. 
Recall also that a relation P is called connected on a set A if for all distinct 
x, y ∈ A at least one of xPy or yPx holds.


Variants of the following basic problems were given in Chap. 1. 


Problem 445. Let P be a relation on a set A. If P is asymmetric on A then it is 
irreflexive on A, but the converse implication may fail. If P is transitive, then P is 
asymmetric on A if and only if it is irreflexive on A. 


Problem 446. Let P be a transitive relation on the set A. Then each of the 
following conditions is equivalent to the others: 


1. P orders A. 
2. P is asymmetric and connected on A. 
3. P is irreflexive and connected on A. 
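
For a finite set, the conditions in Definition 444 and Problem 446 can be checked by brute force. The following is a minimal sketch, not part of the text: the function names are hypothetical, and a relation P is assumed to be represented as a Python set of pairs (x, y) meaning xPy.

```python
from itertools import product

def is_transitive(P):
    # P is a set of pairs (x, y), read as "x P y".
    return all((x, z) in P for (x, y) in P for (w, z) in P if y == w)

def has_trichotomy(A, P):
    # Exactly one of xPy, x = y, yPx holds for every x, y in A.
    return all(((x, y) in P) + (x == y) + ((y, x) in P) == 1
               for x, y in product(A, repeat=2))

def is_order(A, P):
    return is_transitive(P) and has_trichotomy(A, P)

def is_irreflexive(A, P):
    return all((x, x) not in P for x in A)

def is_asymmetric(P):
    return all((y, x) not in P for (x, y) in P)

def is_connected(A, P):
    return all((x, y) in P or (y, x) in P
               for x, y in product(A, repeat=2) if x != y)

A = {1, 2, 3}
less = {(x, y) for x in A for y in A if x < y}
# Proper divisibility is transitive and irreflexive, but 2 and 3 are incomparable.
divides = {(x, y) for x in A for y in A if x != y and y % x == 0}

print(is_order(A, less), is_order(A, divides))
```

Both sample relations are transitive, so the three conditions of Problem 446 agree on each of them: all hold for `less` and all fail for `divides`.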


Remark. If P is an order on A, then by trichotomy the Cartesian product A x A is 
partitioned into three pairwise disjoint sets: 


{(x, y) | xPy},   {(x, y) | x = y},   and   {(x, y) | yPx}.

In particular for any a ∈ A, the three sets

{x ∈ A | xPa},   {a},   and   {x ∈ A | aPx}


are pairwise disjoint with union equal to A. 


Remark. We are defining order in the strict sense, i.e., as an irreflexive relation. One 
could also define an ordering relation as a reflexive relation without any essential 
changes. 


Terminology and Notation 


If P orders A, we will write x <_P y to denote xPy. When there is no chance
of confusion, we even drop the subscript P and simply write x < y for xPy, a
notation that will be used routinely. We will further abuse terminology and usually
say “A is an order” in place of “(A, <) is an order.” The informal phrase “A is an
order” is really an abbreviation for “A is a set with an associated ordering which
will be denoted by <.”

Of course, if there are multiple orderings, say P and Q, on the same set A, then
the notation x < y may be ambiguous and we may need to explicitly distinguish
between x <_P y and x <_Q y. Also, two orderings on two different sets X and Y
will sometimes be denoted by <_X and <_Y, respectively.

In addition, the usual notational enhancements for the symbol < will be used.
For example, “x ≤ y” stands for “x < y or x = y,” “x > y” means “y < x,”
“x < y < z” is an abbreviation for “x < y and y < z,” etc.


7.2 Some Basic Definitions: Suborders 


Definition 447. Let X be an order. 

Given a ∈ X, the elements of the set {x ∈ X | x < a} are called the predecessors
of a, and the elements of the set {x ∈ X | a < x} are called the successors of a.

If x, y ∈ X, x is an immediate predecessor of y in X, or equivalently y is an
immediate successor of x in X, if x < y and there is no z ∈ X with x < z < y. We
also say that x and y are consecutive elements or immediate neighbors to mean that 
one of them is an immediate successor of the other. 


It is easily seen that each element has at most one immediate successor or 
immediate predecessor. 


Example 448. The sets N and R can each be equipped with the usual order of 
magnitude among the elements, but these are two separate orderings. In N, the set 
of predecessors of the element 4 is the finite set {1,2,3}, while in R the set of 
predecessors of 4 is the entire open interval (−∞, 4). The set of successors of 4 in
N is the infinite but countable set {5, 6, 7, 8, ...}, while in R the set of successors of
4 is the uncountable open interval (4, ∞).

In N, 7 is an immediate successor of 6, so 6 and 7 are consecutive elements of 
N, while in R the same elements 6 and 7 are not consecutive. In fact, in N, every 
element has a (unique) immediate successor and every element other than 1 has an
immediate predecessor (1 has no predecessor at all in N). On the other hand, in R, 
no element has an immediate successor or immediate predecessor, and so there are 
no consecutive elements in R. 
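
The notions of Definition 447 can be tried out on finite suborders. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, X is a finite set of numbers with the usual order, and `min` is used to pick the immediate successor (which works only because X is finite).

```python
from fractions import Fraction

def predecessors(X, a):
    return {x for x in X if x < a}

def successors(X, a):
    return {x for x in X if a < x}

def immediate_successor(X, a):
    # In a finite order, the immediate successor is the least successor, if any.
    s = successors(X, a)
    return min(s) if s else None

N10 = set(range(1, 11))            # a finite piece of N
finer = N10 | {Fraction(13, 2)}    # the same points plus 13/2

print(predecessors(N10, 4))
print(immediate_successor(N10, 6), immediate_successor(finer, 6))
```

In `N10` the elements 6 and 7 are consecutive, but after adding 13/2 they no longer are, which mirrors the contrast between N and R in Example 448.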


Definition 449. Let X be an order, A ⊆ X, and a ∈ X.

a is a lower bound of A, written a ≤ A, if a ≤ x for all x ∈ A, and a is a first
(or least) element of A if a ∈ A and a ≤ A.

Upper bounds and last or greatest elements are defined similarly. 

An element a ∈ X is called an endpoint of the order X if a is either a first or a
last element of the whole set X. An ordering which has neither a first nor a
last element is called an ordering without endpoints.

The subset A is called bounded below if there is some a ∈ X which is a lower
bound of A. Similarly we define bounded above. A is bounded if it is both bounded
above and bounded below.


If A, B ⊆ X, we write A < B to mean (∀x ∈ A)(∀y ∈ B)(x < y).


By trichotomy, a first (or last) element of a set, if it exists, is unique, and we write
a = min A for “a is the first element of A,” and a = max A for “a is the last element
of A.”


Example 450. Let each of N and R be ordered as before (by usual order of
magnitude). The ordering N has a first element, but no last element. R is an ordering
without endpoints. If A = [0, ∞) is the subset of R consisting of the nonnegative
reals, then A has a least element, 0. Also A is bounded below in R but not bounded
above.


Problem 451. For each of the following, give an example of a set X and a 
nontrivial order on X satisfying the given condition. 


1. X is countable, has a first and a last element, but there are no consecutive
elements in X.

2. X has a last element but no first element. 

3. X is infinite, X has a first and a last element, and each element except the last 
has an immediate successor and each element except the first has an immediate 
predecessor. 

4. X has a unique element which has neither an immediate successor nor an imme- 
diate predecessor while every other element has both an immediate successor 
and an immediate predecessor. 


Problem 452. If A is a nonempty finite subset of an order X, show that min A and
max A both exist, that is, A contains a least element and a greatest element.


Suborders 


The sets N, Z, Q, and R, each ordered by the natural order of magnitude among 
its elements, are familiar examples of orders. One can obtain many more examples 
of orders either by rearranging the elements of the set (next section), or by passing 
to a subset and regarding the subset as a new order with the ordering on the subset 
inherited from the original order. 

More specifically, given an ordering X and a subset Y ⊆ X, Y becomes an order
in its own right by restricting the order on X to the elements of Y. The resulting
order on Y is said to be the suborder induced by (or the suborder inherited from) 
the ordering of X. 

For example, let X be the set of real numbers with the usual order and let Y ⊆ X
be the subset Y := {n/(n+1) | n ∈ N}. Then the suborder Y has a first element (1/2), but
Y does not have any last element. In Y, the elements 2/3 and 3/4 are consecutive, 
with 3/4 being the immediate successor of 2/3. 

Order properties or relations for points and subsets may not be preserved 
between the suborder and the original parent order. In N ⊆ R, the suborder on
N inherited from the usual order of R is the same as the usual order on N, but we had
noted earlier that 6 and 7 are consecutive elements in the suborder N, while the 
parent order R has no consecutive elements at all. 

As another example, consider the interval [0, 1) as a suborder of R, and let
A := {n/(n+1) | n ∈ N}, so that A ⊆ [0, 1) ⊆ R. Then A is bounded in the parent
order R, but it is not bounded in the suborder [0, 1).


Definition 453 (Intervals, segments, cofinal sets). Let X be an order.

An open interval in X is any subset which can be expressed (for some a, b ∈ X)
in one of the four forms


{x | x < a}   or   {x | a < x}   or   {x | a < x < b}   or the entire order X.


A closed interval in X is a subset of X of the form {x | x ≤ a} or {x | a ≤ x} or
{x | a ≤ x ≤ b} (for some a, b ∈ X) or X or ∅. An interval is a subset which is
either an open or a closed interval.

A subset A is an initial segment of X if it is “closed under precedence,” that is,
if a ∈ A and x < a ⇒ x ∈ A (any predecessor of any element of A is also in
A). Final segments are similarly defined. A subset A is a segment in X if whenever
x, y ∈ A and x < z < y then z ∈ A.

A subset A is cofinal in X if for all x ∈ X there is some a ∈ A with x ≤ a.
Coinitial subsets are similarly defined.


Note that every initial (or final) segment is a segment, and that every interval is a 
segment, but in some orders there are segments which are not intervals. For example, 
the subset {x ∈ Q | x² < 2} is a segment, but not an interval, in Q (with usual order).

The complement of an initial segment is a final segment, and vice versa. For an 
order without a last element a subset is cofinal if and only if it is unbounded above. 
For an order with a last element a subset is cofinal if and only if it contains the last 
element. 
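
For finite orders, the segment and cofinality conditions of Definition 453 reduce to direct checks. This is a minimal sketch with hypothetical function names, assuming X is a finite set of numbers with the usual order, A ⊆ X, and cofinality is read with ≤ (so that a last element is cofinal by itself).

```python
def is_initial_segment(X, A):
    # Every predecessor (in X) of an element of A is again in A.
    return all(x in A for a in A for x in X if x < a)

def is_final_segment(X, A):
    return all(x in A for a in A for x in X if a < x)

def is_segment(X, A):
    # Anything of X lying between two elements of A is in A.
    return all(z in A for x in A for y in A for z in X if x < z < y)

def is_cofinal(X, A):
    return all(any(x <= a for a in A) for x in X)

X = set(range(1, 11))
print(is_initial_segment(X, {1, 2, 3}), is_segment(X, {2, 4}), is_cofinal(X, {10}))
```

Since this X has a last element (10), a subset is cofinal exactly when it contains 10, and the complement of the initial segment {1, 2, 3} is a final segment.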


7.3. Isomorphisms, Similarity, and Rearrangements 


Let A = {1,3,5,7,...} be the set of odd positive integers and B = {2,4,6,8,...} 
be the set of even positive integers, with each of them ordered by the usual order of 
magnitude. Consider the correspondence between them as displayed below: 


A:  1 < 3 < 5 < 7 < ··· < 2n − 1 < ···
    ↕   ↕   ↕   ↕           ↕
B:  2 < 4 < 6 < 8 < ··· <   2n   < ···


If f denotes this mapping from A to B so that f(n) = n + 1, then note that
f : A → B not only is a bijection, but also preserves order in the sense that for
all m, n ∈ A, m < n in A if and only if f(m) < f(n) in B. Such an order-preserving
bijection between two orders is called an order isomorphism, and two
orders are said to be similar or order isomorphic if there is an order-preserving 
bijection between them. 


Definition 454. Let A and B be orders. A mapping f is an order isomorphism
from A to B if f : A → B is a bijection which also preserves order, that is, for all
x, y ∈ A, x < y in A if and only if f(x) < f(y) in B.

Two orders A and B are similar or order isomorphic, written A ≅ B, if there is
some order isomorphism between them.


Problem 455. If A and B are orders, then f : A → B is said to be strictly
increasing if whenever x < y in A then f(x) < f(y) in B. Show that a mapping 
from one order to another is an order isomorphism if and only if it is strictly 
increasing and onto. 
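
On finite orders, Definition 454 and the criterion of Problem 455 can be compared directly. The sketch below is illustrative only: the names are hypothetical, the orders are finite lists of integers with the usual order, and a mapping is assumed to be given as a Python dict.

```python
def is_strictly_increasing(A, f):
    return all(f[x] < f[y] for x in A for y in A if x < y)

def is_onto(A, B, f):
    return {f[x] for x in A} == set(B)

def is_order_isomorphism(A, B, f):
    injective = len({f[x] for x in A}) == len(set(A))
    preserves = all((x < y) == (f[x] < f[y]) for x in A for y in A)
    return injective and is_onto(A, B, f) and preserves

A = [1, 3, 5, 7, 9]            # finite piece of the odd numbers
B = [2, 4, 6, 8, 10]           # finite piece of the even numbers
f = {n: n + 1 for n in A}      # the correspondence n -> n + 1
g = {n: 11 - n for n in A}     # a bijection onto B that reverses the order

print(is_order_isomorphism(A, B, f), is_order_isomorphism(A, B, g))
```

Here `f` is strictly increasing and onto, hence an order isomorphism, while `g` is a bijection that fails to preserve order.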


Example 456. Let A := N and B := {n/(n+1) | n ∈ N}, where both sets are ordered by
the usual order of magnitude. Then A and B are isomorphic via the order-preserving
correspondence n ↔ n/(n+1).


In fact, we had already seen some examples of order isomorphisms in the chapter
on cardinals, where we saw that there is a bijective order-preserving correspondence
between any two proper closed intervals of the real line. Furthermore, the mapping
x ↦ x/(1 + x) was seen to be an order isomorphism between [0, ∞) and [0, 1).

When two orders X and Y are isomorphic they share all order properties which 
do not mention actual elements or subsets of the orders: X has a first element if and 
only if Y has a first element, X has consecutive elements if and only if Y does too, 
and so on. 

Even when specific points and subsets of the orders are mentioned, the isomor- 
phism function will preserve all order properties and relations between them so 
long as those points and subsets are replaced by their appropriate images under 
the function when moving between the two orders: If f : X → Y is an order
isomorphism between the orders X and Y, a ∈ X, and A ⊆ X, then a is the
first element of A in X if and only if f(a) is the first element of f[A] in Y, a is an 
upper bound of A in X if and only if f(a) is an upper bound of f[A] in Y, A is 
bounded above in X if and only if f[A] is bounded above in Y, and so on. 

Informally, two orders are isomorphic if one can be obtained from the other by 
renaming or replacing its points while preserving the order. 


Rearrangements 


Distinct orders on the same set will be called rearrangements. For example, the 
finite set {a,b,c}, where a,b,c are three distinct elements, can be ordered in six 
different ways: 


a < b < c;   a < c < b;   b < a < c;   b < c < a;   c < a < b;   c < b < a,


where each order is a rearrangement of the others. Here all six orders are similar. 


Problem 457. If A is a finite set with n elements, then show that there are exactly
n! distinct orders on A all of which are similar. 
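
The count in Problem 457 can be observed on small sets by listing arrangements. This is a small sketch, not a proof: `orders_on` is a hypothetical helper that represents each order on a finite set as the set of pairs (x, y) with x before y.

```python
from itertools import permutations
from math import factorial

def orders_on(A):
    """Yield each order on the finite collection A as a frozenset of pairs (x, y), x before y."""
    for arrangement in permutations(A):
        yield frozenset((arrangement[i], arrangement[j])
                        for i in range(len(arrangement))
                        for j in range(i + 1, len(arrangement)))

all_orders = set(orders_on("abc"))
print(len(all_orders), factorial(3))
```

Each arrangement gives a different relation, so the six arrangements of {a, b, c} listed in the text are six distinct orders.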


If we start with the usual order of magnitude on the infinite set N,

1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < ···,

we can rearrange it to obtain new orders, such as the order

2 < 1 < 4 < 3 < 6 < 5 < 8 < 7 < ···

which is distinct from the original order since in this new rearrangement of N the
first element is 2 and the immediate successor of 1 is 4. However, note that this
rearrangement is still similar to the usual order on N (under the bijection f defined
by f(n) = n + 1 if n is odd and f(n) = n − 1 if n is even).
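
The rearrangement above can be reproduced on an initial piece of N by sorting with f as the key, since x precedes y in the rearrangement exactly when f(x) < f(y) in the usual order. A minimal sketch (only the first eight numbers are used; the variable name is illustrative):

```python
def f(n):
    # The bijection from the text: swap each odd number with its even neighbour.
    return n + 1 if n % 2 == 1 else n - 1

# Sorting by the key f lists 1..8 in the rearranged order.
rearranged = sorted(range(1, 9), key=f)
print(rearranged)
```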

We can get other rearrangements of N like P, Q, or R below


P:  ··· < 8 < 7 < 6 < 5 < 4 < 3 < 2 < 1
Q:  ··· < 7 < 5 < 3 < 1 < 2 < 4 < 6 < 8 < ···
R:  1 < 3 < 5 < 7 < ··· ··· < 8 < 6 < 4 < 2,


none of which is isomorphic to any other or to the usual order on N: The order
P is the reverse order of the usual order and has a last element but no first; it is
isomorphic to the set of negative integers with the usual ordering. The ordering Q
has neither a first nor a last element, and is isomorphic to the set Z of all integers
with the usual ordering. The ordering R has both a first element and a last element.
These differences in the presence of first and last elements in these orders show that
none of the orders P, Q, R is isomorphic to any other or to the usual order on N.

The usual method for showing that two orders are not isomorphic is to find an 
order property which holds in one order but not the other. Here is one more example. 
Consider the rearrangement S of N shown as


S:  2 < 3 < 4 < 5 < 6 < ··· < 1.


This rearrangement S of N has both a first element (2) and a last element (1), making 
it an ordering having both endpoints. Therefore it is not similar to any of the orders 
above—except possibly R, which also has both endpoints. But note that in R the last 
element has an immediate predecessor, while in S the last element has no immediate 
predecessor, so S and R cannot be isomorphic. Thus S is not isomorphic to any of
the other orders mentioned above.


Problem 458. Show that the three rearrangements of N shown by 


3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < ··· < 1 < 2,

1 < 3 < 5 < 7 < ··· < 2 < 4 < 6 < 8 < ···,

and   3 < 5 < 7 < 9 < ··· < 2 < 4 < 6 < 8 < ··· < 1

are not isomorphic to each other or to any other rearrangement of N mentioned
earlier in this section.


7.4 Order Types and Operations 


Problem 459. Similarity of orders is an equivalence relation. 


We can therefore appeal to the principle of abstraction to fix a complete invariant 
“OrdTyp” for the equivalence relation of similarity between orders.¹


Definition 460 (Order Types, Cantor). For each order X, the order type of X is
denoted by OrdTyp(X). It is a complete invariant for similarity of orders, so that
for all orders X and Y, X ≅ Y ⇔ OrdTyp(X) = OrdTyp(Y). We will sometimes
write OrdTyp_<(X) to make the ordering < explicit.


Problem 461. Two finite orders are isomorphic if and only if they have the same 
number of elements. 


Thus for each finite cardinal number n, there is a unique order type for orders on 
n-element sets, and we denote this order type by n. 

We now introduce special notation for important basic order types. All orders in 
the definition below are assumed to be the usual order of magnitude. 


Definition 462 (Notation for Standard Order Types). 


1. n := OrdTyp({1 < 2 < ··· < n}).
2. ω := OrdTyp(N).
3. ζ := OrdTyp(Z).
4. η := OrdTyp(Q).
5. λ := OrdTyp(R).


Definition 463. If < is an order on X, its reverse order *< is the order on the same
set X defined by x *< y ⇔ y < x. When X is equipped with the reverse order *<,
we will refer to it as *X. An order X is symmetric if X ≅ *X.


Since X ≅ X′ ⇒ *X ≅ *X′, we can make the following definition.


Definition 464. Given an order type α, its reverse order type *α is defined as the
order type of the reverse order of any order of type α.
An order type α is called symmetric if *α = α.


Problem 465. Which of the order types n, ω, ζ, η, and λ are symmetric?


¹ Definition 1299 gives a formal definition of order types in ZF set theory.


Sum of Order Types


Informally, to obtain the sum α + β of two order types α and β we take disjoint
representative orders A and B of type α and β respectively, and then form a single
order by “placing A before B.” More precisely, given order types α and β, one can
construct an order X which can be partitioned into disjoint sets L and U such that
the suborder L has order type α, the suborder U has order type β, and all elements of
L precede all elements of U (so that L is an initial segment in X whose complement
is the final segment U):


X = L ∪ U,   L < U,   OrdTyp(L) = α,   and   OrdTyp(U) = β.


(This is very much like a Dedekind partition except that here we are allowing L
and U to be empty.) Such an order X consists of “an order of type α followed by
an order of type β,” and X is easily seen to be uniquely determined up to order
isomorphism by only the order types α and β.


Theorem 466. Given any pair of order types α and β, there is a unique order type
γ such that any order X of type γ consists of an initial segment L having type α and
a (complementary) final segment U = X ∖ L having type β.


Proof. Uniqueness is routine. For the existence part, let α and β be given order
types. Fix orders L and U with OrdTyp(L) = α and OrdTyp(U) = β. By replacing
L with L′ := {0} × L and U with U′ := {1} × U (and transferring the orders on L
and U to L′ and U′), we may assume that L ∩ U = ∅. Now put X = L ∪ U, and
order X by the rule


x < y  ⇔  either x, y ∈ L and x < y in L,
          or x, y ∈ U and x < y in U,
          or x ∈ L and y ∈ U.


Then it is easy to see that X is a well-defined order satisfying the conditions of the
theorem. □
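
The tagging construction in the proof can be carried out concretely for finite orders. The following is a minimal sketch under stated assumptions: `disjoint_sum` and `less` are hypothetical names, and L and U are finite lists of numbers with the usual order.

```python
def disjoint_sum(L, U):
    """Tag elements so the two copies are disjoint: L' = {0} x L, U' = {1} x U."""
    return [(0, x) for x in L] + [(1, y) for y in U]

def less(p, q):
    """The three-clause rule from the proof, applied to tagged elements."""
    (s, x), (t, y) = p, q
    if s == t:                  # both in L', or both in U': compare inside that order
        return x < y
    return s == 0 and t == 1    # p in L' and q in U'

L = [1, 2, 3]
U = [1, 2]                      # overlaps with L on purpose
X = disjoint_sum(L, U)
print(X)
```

For tagged pairs of this shape the rule coincides with Python's built-in tuple comparison, so `sorted(X)` lists the sum order with all of L′ before all of U′.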


Definition 467 (Sum of two order types). If α and β are any two order types, then
α + β denotes the order type γ of the preceding theorem. In other words, α + β is the
unique order type of any order X which can be partitioned into an initial segment L
of type α and a final segment U = X ∖ L of type β.


Example 468. Take L = {1 < 2 < ··· < n} and U = {n + 1 < n + 2 < ···} in N
with the usual order:


1 < 2 < ··· < n  <  n + 1 < n + 2 < ···
      (L)                  (U)


Since L has order type n, U has order type ω, and the entire order N has order type
ω, it follows that (for any n ∈ N):


n + ω = ω.

In particular

1 + ω = ω.


On the other hand, consider the rearrangement of N displayed by

2 < 3 < 4 < 5 < ··· < 1,


which has order type ω + 1 (take L = {2 < 3 < 4 < ···} and U = {1}). This
ordering has a last element, and so is not similar to the usual order on N. Hence:


1 + ω ≠ ω + 1,


which shows that the commutative law fails for addition of order types.


On the other hand, the associative law holds, allowing us to write expressions such
as α + β + γ unambiguously without using parentheses. In particular, if the order
A has order type α and A is partitioned into segments A_1, A_2, ..., A_n with A_1 <
A_2 < ··· < A_n and α_k = order type of A_k (k = 1, 2, ..., n), then we have:


α = α_1 + α_2 + ··· + α_n.


Problem 469. Show that *(α + β) = *β + *α.


Problem 470. Verify which of the following equations involving order types are 
correct: 


1. ω + *ω = ζ          5. λ + λ = λ
2. *ω + ω = ζ          6. η + 1 + η = η
3. *ω + 3 + ω = ζ      7. η + η = η
4. λ + 1 + λ = λ       8. ζ + ω = *ω + ζ


The cancellation laws for addition fail. For example,

1 + ω = 2 + ω,   but   1 ≠ 2.


However, certain special forms of cancellation work. 


Problem 471. Show that if n is a finite order type then n can be cancelled both
from the left and from the right:


n + α = n + β ⇒ α = β,   and   α + n = β + n ⇒ α = β.


Show also that if m, n are finite order types and α is the order type of an order
without a first element, then


m + α = n + α ⇒ m = n.


Problem 472. Show that the order type ζ of the integers can be cancelled both from
the left and from the right:


ζ + α = ζ + β ⇒ α = β,   and   α + ζ = β + ζ ⇒ α = β.


Ordered Sum of a Family of Order Types 


We now define the ordered sum of any family of order types indexed by an 
ordered set. 


Definition 473 (Ordered sum of order-indexed order types (AC)). Let I be an
order (which is to be the index set) and for each i ∈ I let α_i be an order type. The
ordered sum


∑_{i∈I} α_i


is defined as the order type of any order X which can be partitioned into pairwise
disjoint segments X_i, i ∈ I, such that the suborder X_i has order type α_i for each
i ∈ I and X_i < X_j in X whenever i < j in I.


The proof of existence and uniqueness of the ordered sum ∑_{i∈I} α_i is similar to
the previous proof, but the Axiom of Choice is used as needed to fix representative
orders (or order isomorphisms). We outline the proof of existence. Given a family
of order types ⟨α_i | i ∈ I⟩, where I is an order, first fix (using AC) an order X_i of
type α_i for each i ∈ I. By replacing X_i by X′_i := {i} × X_i (and transferring the
order on X_i to X′_i) we may assume that the orders X_i (i ∈ I) are pairwise disjoint.
Put X := ⋃_{i∈I} X_i, and define an order on X by


x < y  ⇔  either, for some i ∈ I: x, y ∈ X_i and x < y in X_i,
          or, for some i, j ∈ I: i < j in I, x ∈ X_i, and y ∈ X_j.


Then X becomes an order satisfying the condition of the definition. □


Problem 474. For n ∈ N, let α_n = 1 + *ω if n is odd and α_n = ω + 1 if n is even.
Show that


∑_{n∈N} α_n = (1 + ζ + 1)ω.


For simple index sets I, it may be possible to write the ordered sum ∑_{i∈I} α_i in an
informal expanded notation. For example if I = N (with the usual order) and for
each n ∈ N, α_n = λ + 1, then


∑_{n∈N} α_n = α_1 + α_2 + α_3 + ···
            = (λ + 1) + (λ + 1) + (λ + 1) + ···
            = OrdTyp((0, 1]) + OrdTyp((1, 2]) + OrdTyp((2, 3]) + ···
            = OrdTyp((0, 1] ∪ (1, 2] ∪ (2, 3] ∪ ···) = OrdTyp((0, ∞)) = λ.


Similarly we can write:


1 + 1 + 1 + ··· = ω
1 + 2 + 3 + ··· = ω
··· + 3 + 2 + 1 = *ω,


etc. 

The above informal notation assumes a generalized version of the associative 
law where all groupings of the summands yield identical sums so long as the overall 
order of the summands is preserved. 


Problem 475. Formulate and prove the generalized associative law for ordered 
sums of families of order types. 


Lexicographic and Anti-lexicographic Ordering 


By lexicographic order we mean the “left-to-right dictionary order” or “ordering by 
first differences,” where two words x and y of the same length are compared by reading
their letters from left to right until the first place where the words differ is located, 
and we declare x < y if the letter of x at that position alphabetically precedes the 
corresponding letter of y. Thus in lexicographic order, letters are more significant to 
the left. The anti-lexicographic order is the opposite “right-to-left dictionary order” 
(or “ordering by last differences”) where letters on the right are more significant. 


Definition 476 (Lexicographic and Anti-lexicographic orders). Let A and B be
orders. The lexicographic order on A × B is defined by the rule (for (a, b), (a′, b′) ∈
A × B):


(a, b) < (a′, b′)  ⇔  a < a′ in A,  or  a = a′ and b < b′ in B.


The anti-lexicographic order on A × B is defined by the rule (for (a, b), (a′, b′) ∈
A × B):


(a, b) < (a′, b′)  ⇔  b < b′ in B,  or  b = b′ and a < a′ in A.


Problem 477. Show that the anti-lexicographic order on A × B is similar to the
lexicographic order on B × A.
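
The two rules of Definition 476 translate directly into comparison functions, and the similarity claimed in Problem 477 can be tested on small finite orders. A minimal sketch with hypothetical names, assuming A and B are finite lists whose elements are compared with the usual `<`:

```python
from itertools import product

def lex_less(p, q):
    (a, b), (a2, b2) = p, q
    return a < a2 or (a == a2 and b < b2)

def antilex_less(p, q):
    (a, b), (a2, b2) = p, q
    return b < b2 or (b == b2 and a < a2)

def swap(p):
    # The candidate isomorphism (a, b) -> (b, a) from A x B to B x A.
    return (p[1], p[0])

A = [1, 2, 3]
B = ["x", "y"]
AxB = list(product(A, B))

# A x B listed in anti-lexicographic order: the second coordinate is compared first.
print(sorted(AxB, key=swap))
```

On this example the swap map carries the anti-lexicographic order on A × B exactly onto the lexicographic order on B × A; the general proof is still the content of the problem.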


Product of Order Types 


It is easily verified that if A ≅ A′ and B ≅ B′ then under lexicographic orders,
A × B ≅ A′ × B′. This allows us to define the product of two order types, but the
standard convention is to define it with the order of the factors reversed as follows.


Definition 478 (Product of two order types). If α and β are order types, then the
product αβ is defined to be the order type of the Cartesian product B × A under the
lexicographic order (or the order type of A × B under the anti-lexicographic order)
where A and B are orders of type α and β, respectively. Notice the reversal of the
order of the factors.


In particular, note that


If A has type α and B has type β, then the lexicographic product A × B has
type βα, not αβ.


Product as Repeated Sum 


It is often more convenient to view the product defined above as a “repeated sum” 
of the first factor. The following useful result, whose proof is routine, allows us to 
view the product as a “repeated sum”: 


Theorem 479 (Product as Repeated Sum, AC). If α and β are two order types,
then their product αβ equals the ordered sum of an indexed family of types where
the index set has type β and where each summand is α:


αβ = ∑_{i∈B} α_i,   where B has order type β and α_i = α for all i ∈ B.


This sum is independent of the choice of the representative order B.

Notice that the order of the factors matters heavily in the product, and αβ is viewed as
“α repeated β times.”
Thus 2ω is “2 repeated ω times” and so equals ω:


2ω = 2 + 2 + 2 + ··· = ω,

while ω2 is “ω repeated 2 times,” or:

ω2 = ω + ω.


But ω + ω ≠ ω because an order of type ω + ω contains elements with infinitely
many predecessors, whereas any element in an order of type ω has only finitely
many predecessors. Thus


ω2 ≠ 2ω,


and so multiplication of order types is not commutative.
Also, both the right and left cancellation laws for multiplication fail, since


(1 + λ)ζ = (λ + 1)ζ = λ   but   1 + λ ≠ λ + 1,
and   (1 + λ)1 = (1 + λ)2   but   1 ≠ 2.

On the other hand, the associative law for multiplication and the left distributive law
(but not the right) hold.


Problem 480. Prove the associative law for multiplication of order types.


Problem 481. Show that the left distributive law α(β + γ) = αβ + αγ holds, but
the right distributive law (α + β)γ = αγ + βγ fails.


Problem 482. Show that *(αβ) = *α*β.


We write α² for αα, α³ for ααα, etc. We also define α⁰ = 1 and α¹ = α.

In particular, the lexicographic ordering on N × N has order type ω², but let us
examine the following example of rearranging the natural numbers into an ordering
of type ω².


Example 483. Consider the following rearrangement of N in which we first put the 
odd numbers in increasing order of magnitude, followed by “the doubles of the odd 
numbers” (numbers divisible by 2 but not by 4), followed by the “doubles of the 
doubles” (numbers divisible by 4 but not by 8), and so on: 


1 < 3 < 5 < ··· < 2 < 6 < 10 < ··· < 4 < 12 < 20 < ··· < 8 < 24 < 40 < ··· ···


This order starts with an order of type ω, followed by another order of type ω, and
so on in an infinite sequence of successive orders each having type ω. This ordering
thus has order type ωω = ω². We can formally define the rearrangement displayed
above in terms of the usual order by the rule that x precedes y in this order if and
only if


v_2(x) < v_2(y)   or   v_2(x) = v_2(y) and x < y,


where v_2(x) denotes the exponent of the highest power of 2 that divides x (v_2(x) = 0
if x is odd). Of course we could have used any other prime in place of 2 and obtained
a different rearrangement of N of order type ω².
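
The rule of Example 483 amounts to comparing the pairs (v_2(x), x) lexicographically, which makes it easy to list an initial piece of N in the rearranged order. A minimal sketch (the function names are illustrative, and only the numbers 1 to 12 are sorted):

```python
def v2(x):
    """Exponent of the highest power of 2 dividing the positive integer x."""
    k = 0
    while x % 2 == 0:
        x //= 2
        k += 1
    return k

def precedes(x, y):
    # The rule from the text: compare v2 first, then the usual order.
    return v2(x) < v2(y) or (v2(x) == v2(y) and x < y)

# Sorting by the pair (v2(x), x) realises the same rule.
block_order = sorted(range(1, 13), key=lambda x: (v2(x), x))
print(block_order)
```

The output groups the odd numbers first, then the doubles of odd numbers, and so on, matching the displayed rearrangement restricted to 1, ..., 12.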


Each segment of type ω in the above order can itself be rearranged into an order of
type ω², making the order type of the overall arrangement ω³. Evidently, the process
can be iterated to generate orders of type ω⁴, ω⁵, etc.


Problem 484. Consider the order on N in which x precedes y if and only if either


v_2(x) < v_2(y), or v_2(x) = v_2(y) and v_3(x) < v_3(y), or v_2(x) = v_2(y), v_3(x) =
v_3(y) and x < y.


1. What is the order type of this order? 

2. What is the order type of the suborder consisting of all predecessors of the 
element 600 in this order? 

3. What is the order type of the suborder consisting of all successors of the element 
600 in this order? 


Problem 485. Find a rearrangement of N of order type w*. 
Find a rearrangement of N of order type ∑_n ω^n = ω + ω² + ··· + ω^n + ···.


The left distributive law can be used to simplify expressions involving powers. For
example, ω + ω² = ω(1 + ω) = ωω = ω². More generally, any integral power of
ω can “absorb” any lower power of ω from the left, but not from the right:


Problem 486. If 0 ≤ m < n are nonnegative integers, then ω^m + ω^n = ω^n but
ω^n + ω^m ≠ ω^n.


Lexicographic Orders with Many Factors 


One can readily extend the definitions of lexicographic ordering to n factors (n ∈ N)
instead of just two factors. For example, the lexicographic order on N³ = N × N × N
has order type ω³. In general, given u = (u_1, u_2, ..., u_k) and v = (v_1, v_2, ..., v_k) in
N^k, we say that u precedes v in the lexicographic order on N^k if and only if u_j < v_j
for the least index j with u_j ≠ v_j.

It is also possible to extend this notion to infinitely many factors indexed by an
index set provided that the index set is well-ordered, an order property that will be
studied in a later chapter. Here we introduce an important special case, namely the
lexicographic order on the “power” A^N, the set of sequences from the order A.


Definition 487. Let A be an order. For sequences a = ⟨a_n | n ∈ N⟩ and b =
⟨b_n | n ∈ N⟩ in A^N, we say that a precedes b in the lexicographic order on A^N if
a_n < b_n for the least n at which a_n ≠ b_n. That is,


a < b  ⇔  for some n, a_n < b_n but a_k = b_k for all k < n.


Recall that the set 2^N of binary sequences is equinumerous with the Cantor set via
the bijection F given by


F(a) = ∑_{n=1}^∞ 2a_n/3^n   (a ∈ 2^N).


It turns out that with the lexicographic order on 2^N, the bijection F is also
order-preserving:


Problem 488. Show that the map F above from 2^N onto the Cantor set is an order
isomorphism, and so the Cantor set (with the usual order) is similar to the set 2^N of
binary sequences ordered lexicographically.
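
The order-preserving behaviour of F can be observed on finite prefixes. The sketch below rests on an assumption: it takes F to be the standard ternary map a ↦ ∑ 2a_n/3^n onto the Cantor set, and it evaluates F exactly (with rational arithmetic) on binary sequences that are zero after the sixth term. It illustrates Problem 488 without proving it.

```python
from fractions import Fraction
from itertools import product

def F(a):
    """Sum of 2*a_n / 3^n over a finite 0/1 tuple (n starts at 1).

    For a sequence that is 0 beyond the tuple, this equals the full series.
    """
    return sum(Fraction(2 * bit, 3 ** n) for n, bit in enumerate(a, start=1))

prefixes = list(product((0, 1), repeat=6))   # generated in lexicographic order
values = [F(a) for a in prefixes]

print(values == sorted(values))
```

Because `itertools.product` emits the tuples in lexicographic order, an increasing list of values means F preserves the order on these 64 sequences.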


Regarding elements of 2^N as binary expansions of reals in [0, 1) we get another
order-preserving map, provided that we discard duplicate binary expansions. 


Problem 489. Let D be the suborder of 2^N (ordered lexicographically) consisting
of binary sequences with infinitely many zeros, that is

D := {a ∈ 2^N | a_n = 0 for infinitely many n}.

Show that D is similar to the real interval [0, 1) with the usual order.
[Hint: Use the map φ: D → [0, 1) defined by φ(a) = ∑_{n=1}^∞ a_n/2^n.]


Problem 490. Show that the Cantor set has a subset of order type λ.


Problem 491. Suppose that the set N^N of all sequences of positive integers is
ordered lexicographically, and let N^N_↑ be the suborder of N^N consisting of strictly
increasing sequences. Show that, under the lexicographic ordering,


1. N^N is isomorphic to the suborder N^N_↑.
2. Each of N^N and N^N_↑ has order type 1 + λ.


[Hint: For the first part, consider the mapping from N^N to N^N_↑ given by
(n_1, n_2, n_3, ...) ↦ (n_1, n_1 + n_2, n_1 + n_2 + n_3, ...). For the second part, show
that the mapping

(n_1, n_2, n_3, ...) ↦ 1/2^{n_1} + 1/2^{n_2} + 1/2^{n_3} + ···   (n_1 < n_2 < n_3 < ···)

is an order-reversing bijection from N^N_↑ onto (0, 1].]


The definitions of lexicographic and anti-lexicographic orderings given above
can only compare words of the same length, and the question of how to compare 
two words of different length is left open. The problem below considers several 
possibilities for extending the definition of lexicographic order to the collection N* 
of finite sequences of positive integers of all possible lengths. 


Problem 492. Let N* be the set of all finite sequences (strings) of natural numbers. 
Try to find the order type for each of the following orders defined on N*, all of which 
extend the lexicographic orders on N^k, k = 1, 2, ....


1. The order on N* defined by the rule that u = (u_1, u_2, ..., u_m) precedes v =
(v_1, v_2, ..., v_n) in N* if and only if m < n, or m = n and u precedes v
lexicographically in N^m (= N^n).

2. The order on N* defined by the rule that u = (u_1, u_2, ..., u_m) precedes v =
(v_1, v_2, ..., v_n) in N* if and only if either u is a proper initial segment of v, or
there is k ≤ min(m, n) with u_k < v_k and u_j = v_j for all j < k.

3. The order on N* defined by the rule that u = (u_1, u_2, ..., u_m) precedes v =
(v_1, v_2, ..., v_n) in N* if and only if either v is a proper initial segment of u, or
there is k ≤ min(m, n) with u_k < v_k and u_j = v_j for all j < k.


The order in the last part of the problem is known as the Kleene—Brouwer order and 
will reappear as Problem 550. 
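
As an aid to experimenting with Problem 492, the three rules can be written as comparison functions. The following Python sketch is only an illustration and not part of the problem: the function names are our own, strings are represented as tuples of positive integers, and sorting a finite sample can suggest, but of course not prove, what the order types are.

```python
from functools import cmp_to_key

def shortlex_precedes(u, v):
    """Rule 1: shorter strings come first; equal lengths are compared lexicographically."""
    return (len(u), tuple(u)) < (len(v), tuple(v))

def prefix_first_precedes(u, v):
    """Rule 2: a proper initial segment precedes its extensions.
    This coincides with Python's built-in ordering of tuples."""
    return tuple(u) < tuple(v)

def kleene_brouwer_precedes(u, v):
    """Rule 3: a proper extension precedes its initial segments."""
    u, v = tuple(u), tuple(v)
    for a, b in zip(u, v):
        if a != b:
            return a < b        # the first place where u and v differ decides
    return len(u) > len(v)      # one is an initial segment of the other: longer first

def sort_by(precedes, strings):
    """Sort a finite list of strings according to a strict order given by `precedes`."""
    def compare(u, v):
        if precedes(u, v):
            return -1
        if precedes(v, u):
            return 1
        return 0
    return sorted(strings, key=cmp_to_key(compare))

sample = [(1,), (2,), (1, 1), (1, 2), (2, 1), (1, 1, 1)]
for rule in (shortlex_precedes, prefix_first_precedes, kleene_brouwer_precedes):
    print(rule.__name__, sort_by(rule, sample))
```

Comparing the three printed lists on the same sample shows how the placement of initial segments is the only thing that distinguishes the second rule from the third.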


Chapter 8 
Dense and Complete Orders 


Abstract This chapter introduces some basic topological notions in the context of
orders, and then develops the theories of dense orders and complete orders in a
general setting. We cover Cantor’s theorem on countable dense linear orders, Dedekind
completeness and completions, order characterization of R, connectedness and the
intermediate value theorems for linear continuums, and the perfect set theorem for
complete orders.


8.1 Limit Points, Derivatives, and Density 


Definition 493. Let X be an order, let a ∈ X, and let A ⊆ X.

We say that a is an upper limit point of A in X if a is not the first element of X,
and for every x < a there is some p ∈ A such that x < p < a. Lower limit points
of A in X are similarly defined.

We say that a is a limit point of A in X if a is a lower or an upper limit point of 
A in X, and a is a two-sided limit point of A in X if a is both a lower and an upper 
limit point of A in X. 

If the parent order X is clear, we simply use the phrase “limit point of A” without 
the qualifier “in X.” 

The set of all limit points of A (in X) is called the derived set of A or the 
derivative of A, and will be denoted by D(A). 


Example 494. Let X be the order R, and let

$$A = \left\{ \frac{n}{n+1} \;\middle|\; n \in \mathbf{N} \right\} \cup \{2\}.$$

Then 1 is an upper limit point of A in X, and A has no other limit points (upper or
lower) in X. So D(A) = {1} in X. Thus in general the notions of limit point and
derived set are to be understood relative to the parent order.

On the other hand, if we consider the suborder A as an order by itself, without
bringing back the parent order X, then 2 will be an upper limit point of A in the
order A, and so we will have D(A) = {2} in the order A.


An order of type ω, such as N, by itself has no limit points, upper or lower. The
same applies to any order of type ζ. On the other hand, in an order of type η (such
as Q) or type λ (such as R) every point is both an upper and a lower limit point (of
the entire order).

An upper limit point cannot have an immediate predecessor. If X is an order and 
a € X is not the first element of X, then a is an upper limit point in X if and only 
if a has no immediate predecessor in X. 


Problem 495. With the lexicographic order, which points are the limit points in
Z × N? In N × Z?


Problem 496. Show that an order X is without endpoints and without any limit
point (i.e., every element of X has both an immediate predecessor and an immediate
successor) if and only if the order type of X has the form ζα for some order type α.


Recall the order X from Example 483 with order type ω²:

1 < 3 < 5 < ··· 2 < 6 < 10 < ··· 4 < 12 < 20 < ··· 8 < 24 < 40 < ··· ···


In X the elements 2,4,8,... are the upper limit points, but there are no lower limit 
points. If A denotes the subset of X consisting of the odd numbers, then 


D(A) = {2}, while D(X) = {2,4,8,16,...}. 
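
The rearrangement above can be made concrete by writing each positive integer as 2^(m−1)(2n − 1) and comparing the pairs (m, n) lexicographically. The Python sketch below is our own illustration (the function names are not from the text). It uses the earlier observation that a point which is not the first element is an upper limit point exactly when it has no immediate predecessor, which in this order means being the first element of a block other than the first block.

```python
def block_and_position(x):
    """Write x = 2**(m-1) * (2*n - 1) and return the pair (m, n)."""
    m = 1
    while x % 2 == 0:
        x //= 2
        m += 1
    return (m, (x + 1) // 2)

def precedes(x, y):
    """x comes before y in the rearranged order of type omega^2."""
    return block_and_position(x) < block_and_position(y)

def is_upper_limit_point(x):
    """True when x starts a block other than the first one, so that x is
    not the first element and has no immediate predecessor."""
    m, n = block_and_position(x)
    return m > 1 and n == 1

# An initial stretch of the integers, listed in the rearranged order.
print(sorted(range(1, 21), key=block_and_position))

# The upper limit points among 1, ..., 39.
print([x for x in range(1, 40) if is_upper_limit_point(x)])
```

Only finitely many elements are listed, so the code describes the order rather than verifying any limit statement; the limit points it reports are the powers of 2 other than 1, in agreement with D(X) above.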


Example 497. We slightly modify the order of Example 483 by moving 1 to the
last position to get an order of type ω² + 1:


3 < 5 < 7 < ··· 2 < 6 < 10 < ··· 4 < 12 < 20 < ··· 8 < 24 < 40 < ··· ··· 1.


If Y denotes this order, note that in Y the elements 2, 4, 8, ..., as well as the last
element 1, are all limit points. Hence:


D(Y) = {2 < 4 < 8 < ··· < 1}.


If we regard the suborder D(Y) as an order by itself (of order type ω + 1), then note
that the elements 2, 4, 8, ... are no longer limit points in D(Y), but 1 is still a limit
point in D(Y). This is because in the original order Y, 1 was a limit point of limit
points. Thus


D(D(Y)) = {1}.

Problem 498. Let X be an order and let A ⊆ X.

1. Show that if A has a limit point in X then A must be infinite. Hence D(A) = ∅
if A is finite.

2. If A, B ⊆ X then show that D(A ∪ B) = D(A) ∪ D(B).

3. Show that


D(D(A)) ⊆ D(A),


that is, “a limit point of limit points of A is a limit point of A.” 


Thus the elements of D(A) are limit points of A, elements of D(D(A)) are 
limit points of limit points—or second order limit points of A, the elements of 
D(D(D(A))) are the third order limit points of A, and so on. 

Let us write D^(0)(A) := A, D^(1)(A) := D(A), D^(2)(A) := D(D(A)), etc., so
that the elements of D^(k)(A) are the limit points of A of order k.


Problem 499. Let X = R and let

$$A := \left\{ \frac{1}{2^m} + \frac{1}{2^{m+n}} \;\middle|\; m, n \in \mathbf{N} \right\}.$$

What is the order type of the suborder A? Compute D^(k)(A) for k = 1, 2, 3.


Problem 500. Give an example of a subset A of R such that D^(3)(A) ≠ ∅ but
D^(4)(A) = ∅.


Problem 501. If X is an order of type ω^n + 1 (where n ∈ N), what are the order
types of D^(k)(X) for various k ∈ N?


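Consider an order of type

$$\Bigl(\sum_{n \in \mathbf{N}} (\omega^n + 1)\Bigr) + 1 = \omega + 1 + \omega^2 + 1 + \cdots + \omega^n + 1 + \cdots + 1 = \omega + \omega^2 + \cdots + \omega^n + \cdots + 1 = \Bigl(\sum_{n \in \mathbf{N}} \omega^n\Bigr) + 1,$$

where the last element is a limit point of limit points of order k for arbitrarily large
k ∈ N, and is called a limit point of order ω. In fact, we can define

$$D^{(\omega)}(A) := \bigcap_{k \in \mathbf{N}} D^{(k)}(A), \qquad D^{(\omega+1)}(A) := D\bigl(D^{(\omega)}(A)\bigr), \qquad \dots,$$

and keep iterating the derivative operator D indefinitely without end, a process
that leads to the notion of ordinal numbers that will be studied later.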

Problem 502. For each k ∈ N, put

$$A_k := \left\{ \frac{1}{2^k} + \frac{1}{2^{k+n_1}} + \frac{1}{2^{k+n_1+n_2}} + \cdots + \frac{1}{2^{k+n_1+\cdots+n_k}} \;\middle|\; n_1, n_2, \dots, n_k \in \mathbf{N} \right\}.$$

In terms of fractional binary expansion, the set A_1 consists of binary fractions of the
form 0.10*1, where “0*” stands for a string of zero or more 0s. Thus A_1 ⊆ (1/2, 1).
The set A_2 consists of binary fractions of the form 0.010*10*1, with A_2 ⊆ (1/4, 1/2),
etc. Finally, put:

$$A = \bigcup_{k} A_k.$$


1. Show that each A_k has infinitely many limit points of order < k, exactly one limit
point of order k (namely, 1/2^k), but no limit points of order > k.


2. Show that D^(ω)(A) ≠ ∅ but D^(ω+1)(A) = ∅.


Dense Orders and Dense Subsets 


Definition 503 (Order Density: Dense Orders). A nontrivial order X is said to 
be a dense order or order-dense if for all x, y € X with x < y there is z € X such 
that x < z < y, that is, if X does not contain any consecutive elements. 


Q, R, and any nontrivial rational or real interval are familiar examples of dense
orders. Finite orders and orders of type ω or ζ are examples of orders which are
not order-dense. The following problem illustrates the relationship between order
density and limit points.


Problem 504. Let X be a nontrivial order. Show that 


1. Every element of X is an upper limit point if and only if X is order-dense and 
has no first element. 

2. Every element of X is a lower limit point if and only if X is order-dense and has 
no last element. 

3. Every element of X is a two-sided limit point (both an upper and a lower limit 
point) if and only if X is order-dense and without endpoints. 


Conclude that the following conditions are all equivalent: 


1. X is order-dense. 
2. Every element of X except the first element (if present) is an upper limit point. 
3. Every element of X except the last element (if present) is a lower limit point. 


If X has more than two points, the above conditions are also equivalent to: 


4. Every element of X except the endpoint(s) (if present) is a two-sided limit point.

Definition 505 (Relative Density: Dense Subsets). Let X be an order and suppose
that A ⊆ X. We say that A is dense in X or that A is a dense subset of X if every
element of X ∖ A is a limit point of A, that is, if X ∖ A ⊆ D(A) (or equivalently, if
X = A ∪ D(A)).


For example, Q is dense in R. This follows from the fact that between any two real
numbers there is a rational number.


Problem 506. Let X be a dense order and A ⊆ X. Then A is dense in X if and
only if for all x, y ∈ X with x < y there is a ∈ A with x < a < y.


Problem 507. Let A be a dense subset of a dense order X. Show that the suborder 
A as an order by itself is a dense order. 


Problem 508. Assume that each of the sets N, Z, Q, and R is ordered by the natural 
order of magnitude among its elements. 


1. Give examples of two disjoint subsets of R both of which are dense in R.
2. Prove rigorously that Q is a dense subset of R, but that Z is not dense in R.
3. Which subsets of N are dense in N?
4. If X is an order of type ζ + 1 + ζ, which subsets of X are dense in X?
5. If X is an order of type ω² + 1, which subsets of X are dense in X?


Problem 509. Recall that an open interval in an order X is a subset which has one 
of the following four forms: 


{x ∈ X | a < x < b}, {x ∈ X | x < a}, {x ∈ X | x > a}, or X.


Show that a subset A of X is dense in X if and only if A has nonempty intersection 
with every nonempty open interval of X. 


Problem 510. For A ⊆ R, if R ∖ A is countable, show that A is dense in R. Conclude
that the set of irrational numbers is dense in R.


Problem 511. Let X be an order. Given A, B ⊆ X, we say that A is dense in B if
B ∖ A ⊆ D(A). Suppose that A ⊆ B ⊆ C ⊆ X. Show that if A is dense in B and
B is dense in C then A is dense in C.


A Note on Terminology 


The notion of dense order (order-density) should be carefully distinguished from
the notion of dense subset (relative density), as they belong to different categories:
Order-density is a property of entire orders, while relative density is a property
of subsets of orders. Thus saying “Y is dense” may be ambiguous. To avoid this
ambiguity, we can explicitly indicate order-density by saying “Y is a dense order”
(or “Y is order-dense”), and explicitly indicate relative density by saying “Y is a
dense subset” (or “Y is dense in its parent order”).

8.2 Continuums, Completeness, Sup, and Inf


Recall Dedekind’s method, in which we partition or “cut” a given order into two 
nonempty pieces with one piece completely preceding the other: 


Definition 512. A Dedekind partition or Dedekind cut for an order X is a partition
of X consisting of two nonempty disjoint sets L and U such that x ∈ L, y ∈ U ⇒
x < y, i.e., every member of L precedes every member of U, as pictured below.


[Figure: a line representing X, divided into a left piece L followed by a right piece U.]


In other words, all elements of U are upper bounds for L, all elements of L are
lower bounds for U, as well as L ≠ ∅ ≠ U, L ∩ U = ∅, and L ∪ U = X.

A Dedekind partition L, U in an order X can be classified as exactly one of the 
following four types: 


1. Jump: L has a largest element and U has a smallest element.

2. Upper limit point cut: L does not have a largest element, but U has a smallest
element which therefore is an upper limit point of L.

3. Lower limit point cut: U does not have a smallest element, but L has a largest
element which therefore is a lower limit point of U.

4. Gap: L has no largest element and U has no smallest element.


Note that an order is order-dense if and only if it has no jumps (i.e., if no Dedekind 
partition is a jump). 

In a cut of type (2) or (3), which we call a limit point cut or a boundary cut, the 
two halves of the partition are “connected together” in the sense that one of them 
contains a limit point of the other. 

On the other hand, a cut of type (1) or (4) (a jump or a gap) is a “separation” of
the order into two “disconnected pieces” neither of which contains a limit point of the
other. A continuum is an order which does not admit any such disconnection, that is,
one which has no jump or gap.


Definition 513 (Dedekind Continuity). A nontrivial order is said to be a (linear) 
continuum or Dedekind continuous if no Dedekind cut in it is a jump or a gap, or 
equivalently if every Dedekind cut in it is a limit point cut. 


A more general notion is order completeness (Dedekind completeness). 


Definition 514 (Dedekind Completeness). An order X is said to be complete or 
Dedekind complete if it has no gaps, i.e., if no Dedekind cut for it is a gap. 


So an order is a continuum if and only if it is both order-dense and complete. 

We have already seen that the real line R and all intervals in it are continuums
(being both order-dense and complete). On the other hand neither the rationals Q
nor the integers Z form a continuum: Q is order-dense but not complete, while Z is
complete but not order-dense.

Since each of the properties of being order-dense, being a continuum, and being
complete is preserved under order isomorphisms, we can meaningfully speak of an
order type being order-dense, or a continuum, or complete. Thus η is not complete,
and neither ω nor ζ is dense, while λ, 1 + λ, λ + 1, and 1 + λ + 1 are continuums
(corresponding to the various kinds of intervals). Later we will see several examples
of continuums which are essentially different from these.


Definition 515 (Supremum and Infimum). Let X be an order and let A ⊆ X.

We say that a ∈ X is a least upper bound of A or a supremum of A if a is the
least element of the set of all upper bounds of A in X, that is, if a is an upper bound
of A and a ≤ b for every upper bound b of A. Greatest lower bounds or infimums
are similarly defined.


It is easily seen that a least upper bound or supremum of a set, if it exists, must be 
unique. Therefore we make the following definition. 


Definition 516 (sup A and inf A). If a set A has a supremum, we denote it by sup A,
and thus the statement “a = sup A” stands for “a is the least upper bound of A.” 
Similarly the infimum of A is denoted by inf A. 


For a set with a largest element, the supremum coincides with the maximum. For 
a nonempty set without a maximum, the supremum, if it exists, is an upper bound 
which is also a limit point of the set. 


Problem 517. Let A be a nonempty subset of an order X and let a be an upper
bound of A in X. Prove that a is the least upper bound of A if and only if either a is
the maximum element of A, or a ∉ A and a is an upper limit point of A.


Problem 518 (Completeness as the Least Upper Bound Property). Given an 
order X, prove that the following conditions are equivalent. 


1. X is complete.

2. X has the least upper bound property: Every nonempty subset of X which is
bounded above has a supremum (least upper bound).

3. X has the greatest lower bound property: Every nonempty subset of X which is
bounded below has an infimum (greatest lower bound).


Problem 519. An order is complete if and only if every segment is an interval. 


Problem 520. For each of the following orders, express its order type in terms of 
familiar order types, and determine if it is order dense, if it is complete, and if it is 
a continuum. 


1. N with usual order.
2. Z with usual order.
3. Q with usual order.
4. R with usual order.
5. The real interval (0, 1] with usual order.
6. The set {0} ∪ {1/n | n ∈ N}.
7. The set {−1/n | n ∈ N} ∪ {0} ∪ {1/n | n ∈ N}.
8. The set {−1/n | n ∈ N} ∪ {1/n | n ∈ N}.
9. The closed unit square [0, 1] × [0, 1], with lexicographic order.
10. The half-open unit square [0, 1) × [0, 1), with lexicographic order.
11. The subset of the plane [0, 1] × {0, 1}, with lexicographic order.
12. N × Z, with lexicographic order.
13. Z × N, with lexicographic order.


Problem 521. An order is called totally discrete if every Dedekind cut for it is 
a jump. 


1. Show that an order is totally discrete if and only if it is complete and has no limit points.

2. Give an example of an order which has no limit points yet is not totally discrete.

3. Give examples of three pairwise non-isomorphic infinite totally discrete
orderings.

4. Prove that there do not exist four pairwise non-isomorphic infinite totally
discrete orderings.

5. List the possible order types of totally discrete orders.


8.3 Embeddings and Continuity


Definition 522. Let X and Y be orders. An order isomorphism between X and a
suborder of Y is called an order embedding. If f: X → Y is an embedding, we
say that f embeds X into Y. We say that X is embeddable in Y if there is some
embedding of X into Y.


If f embeds X into Y, then the suborder f[X] = ran(f) of Y is an “isomorphic 
copy” of X sitting inside Y. 
Any strictly increasing map from one order into another is an embedding: 


Problem 523. Let X and Y be orders and let f: X → Y be a function which is
strictly increasing: If x < y in X then f(x) < f(y) in Y. Then f is an embedding
of X into Y, and so X is isomorphic to the suborder f[X] = ran(f) of Y via f.


This makes it easy to find examples of embeddings. For example, the map n ↦ n²
is an embedding of N into itself, and n ↦ n/(n + 1) is an embedding of N into R.
If X is a suborder of Y, then the identity map on X is an embedding of X into 
Y, called the inclusion map. Thus every suborder is embedded in the parent order 
via the inclusion map. 
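
By Problem 523, checking that a map is an embedding amounts to checking that it is strictly increasing. The Python sketch below, with helper names of our own choosing, tests this on a finite initial segment of N. A finite test can refute monotonicity but can only suggest it, so for the two maps in the text the real justification remains the elementary algebra.

```python
from fractions import Fraction

def is_strictly_increasing_on(f, points):
    """Check that x < y implies f(x) < f(y) on a finite set of points.
    Comparing consecutive points suffices, by transitivity."""
    pts = sorted(points)
    return all(f(x) < f(y) for x, y in zip(pts, pts[1:]))

def square(n):
    return n * n                   # the map n -> n^2 from N into N

def shrink(n):
    return Fraction(n, n + 1)      # the map n -> n/(n+1), kept exact with Fraction

def not_monotone(n):
    return (n - 3) ** 2            # a map that is not an embedding of N

sample = range(1, 101)
print(is_strictly_increasing_on(square, sample))
print(is_strictly_increasing_on(shrink, sample))
print(is_strictly_increasing_on(not_monotone, sample))
```

Exact rational arithmetic is used for n/(n + 1) so that the comparison is not affected by floating-point rounding.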
The analogue of the Cantor—Bernstein Theorem fails for orders: 


Problem 524. Give examples of two non-isomorphic orders each of which is 
isomorphic to a suborder of the other. 


Unlike order isomorphisms, order embeddings fail to preserve many order 
notions. In particular, limit points of subsets are not preserved. Let 

$$X := \left\{ \frac{n}{n+1} \;\middle|\; n \in \mathbf{N} \right\} \cup \{1\},$$

so that X has order type ω + 1 and the point 1 is a limit point in X. Define f: X → R
by setting f(1) = 2 and f(x) = x otherwise. Then f is an embedding of X into
R, but f does not preserve limit points: If A ⊆ X is the subset

$$A := \left\{ \frac{n}{n+1} \;\middle|\; n \in \mathbf{N} \right\},$$

then 1 is a limit point of A in X but f(1) = 2 is not a limit point of f[A] = A in R.
We say that the reason for this failure is the “discontinuity” of the map f, and a 
map which preserves limit points is called a continuous map: 


Definition 525 (Continuous Maps). Let X and Y be orders. A map f: X → Y is
called continuous if whenever A ⊆ X and a ∈ X is a limit point of A in X, then
either f(a) ∈ f[A] or f(a) is a limit point of f[A] in Y.


When an order embedding is onto, it becomes an isomorphism and hence continuous 
(an isomorphism preserves all order notions, including limit points). 
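
The failure of continuity for the map with f(1) = 2 discussed before Definition 525 can be traced numerically. The Python sketch below is our own illustration: it works with exact fractions, the helper names are hypothetical, and the second check inspects only finitely many points of A, relying on the general fact that n/(n + 1) < 1 for every n.

```python
from fractions import Fraction

def f(x):
    """The embedding from the example: the top point 1 is moved up to 2."""
    return Fraction(2) if x == 1 else x

def point_of_A_above(x):
    """For x < 1, return a point n/(n+1) of A with x < n/(n+1) < 1."""
    n = 1
    while Fraction(n, n + 1) <= x:
        n += 1
    return Fraction(n, n + 1)

# 1 is an upper limit point of A in X: below 1, there is always a point of A
# strictly between the given x and 1.
for x in [Fraction(0), Fraction(9, 10), Fraction(999, 1000)]:
    p = point_of_A_above(x)
    assert x < p < 1

# f(1) = 2 is not an upper limit point of f[A] = A in R: the real number 3/2
# lies below 2, yet no sampled point of f[A] falls strictly between 3/2 and 2.
sample_of_A = [Fraction(n, n + 1) for n in range(1, 1000)]
witness = Fraction(3, 2)
blocked = all(not (witness < f(p) < 2) for p in sample_of_A)
print(blocked)
```

The witness 3/2 plays the role of the element x in the definition of an upper limit point: one such x with no point of the set above it is enough to show that 2 is not a limit point.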


Example 526. Consider the rearrangement of N

1 < 3 < 5 < ··· 2 < 6 < 10 < ··· 4 < 12 < 20 < ··· ···


of order type ω² that we have encountered before, and define f and g by


$$f\bigl(2^{m-1}(2n-1)\bigr) = m + \frac{n-1}{n}, \qquad g\bigl(2^{m-1}(2n-1)\bigr) = 2m + \frac{n-1}{n}.$$

Then both f and g embed the above rearrangement of N into R, but while f is 
continuous, g is not. 


Problem 527. Let X and Y be nontrivial orders without endpoints, and f: X →
Y. Show that f is continuous if and only if whenever p ∈ X and c < f(p) < d in
Y, there exist a, b ∈ X with a < p < b such that we have c < f(x) < d for all x
satisfying a < x < b.


Limit Points and Suborders: Continuously Embedded 
Suborders 


We now look at the above phenomenon (where an embedding fails to preserve limit
points) for the special case of suborders. Consider X := (0, 1) ∪ [2, 3) as a suborder
of R. When X is considered as an order by itself, it has order type λ and 2 is
a limit point of (0, 1) in the suborder X. But 2 is not a limit point of (0, 1) in the
parent order R.

Definition 528. Let Y be an order and X ⊆ Y be a suborder. We say that the
suborder X is continuously order embedded in the parent order Y if for any A ⊆ X
and a ∈ X, if a is a limit point of A in X, then a is a limit point of A in the parent
order Y as well.


For example, both the suborders A := {n/(n + 1) | n ∈ N} ∪ {1} and B :=
{n/(n + 1) | n ∈ N} ∪ {2} of R have order type ω + 1, but while A is continuously
embedded in R, B is not.


Problem 529. Let Y be an order and X ⊆ Y be a suborder. Show that X is
continuously embedded in Y if and only if the inclusion map from X to Y is
continuous, where the inclusion map ι_X: X → Y is defined as ι_X(x) = x (for
all x ∈ X).


Problem 530. Show that if f: X → Y is an order embedding of the order X into
the order Y, then f is continuous if and only if f[X] = ran(f) is continuously
embedded in Y.


Problem 531. Let X be a suborder of the order Y. Show that X is continuously
embedded in Y if and only if for any A ⊆ X and any a ∈ X, if a = sup A in X then
a = sup A in Y, and if a = inf A in X then a = inf A in Y.


The following theorem, whose proof is left as an exercise, gives sufficient conditions 
for a suborder to be continuously embedded in the parent order. 


Theorem 532. Let X be a suborder of the order Y. 


1. If every point of X is a two-sided limit point of X in Y, then X is continuously
embedded in Y.

2. If every point of Y ∖ X is a two-sided limit point of X in Y, then X is continuously
embedded in Y.


Problem 533. Let X be an order, A ⊆ X, and a ∈ A.


1. Show that if a is a limit point of A in the parent order X then a is a limit point of 
A in the suborder A. 

2. Give an example to show that if a is a limit point of X (in the parent order X ), 
then a may not be a limit point of A either in the suborder A or in the parent 
order X. 

3. Give an example to show that if a is a limit point of A in the suborder A, then a 
may not be a limit point of X (in the parent order X ). 


Problem 534*. Give an example of an order X and a suborder Y ⊆ X such that
in the suborder Y every point of Y is a limit point, but in the original order X no
point of Y is a limit point.


Problem 535. Give an example of an order which contains a suborder of type η,
but in which every point has an immediate successor and an immediate predecessor.


Example 536. Let X be a complete order with endpoints in which the only limit
point is the last element. Then the order type of X is ω + 1.

Proof. Note that every element (other than the last) has an immediate successor,
while the last element (being an upper limit point) is not the immediate successor of any
element. Let a_1 be the first element, a_2 be its immediate successor (which, being an
immediate successor, cannot be the last element), a_3 be the immediate successor of
a_2 (which again cannot be the last element), and so on. Put A = {a_1 < a_2 < ···},
which has order type ω. A is bounded above (by the last element of X), and so
a = sup A exists and is a limit point in X. Hence a must be the last element. Hence
the order type of X must be ω + 1. □


Problem 537. Let X be a complete order with endpoints and exactly one limit 
point. What are the possible order types that X can have? 

Theorem 538 (Hausdorff). There are exactly 2^ℵ₀ = 𝔠 distinct order types for
countable orders.


Proof. Every countable order is isomorphic to an order defined on some subset of N.
Since every order defined on a subset of N equals (A, R) with A ⊆ N and R ⊆ N²
(so that (A, R) ∈ P(N) × P(N²)), there are at most

$$|P(\mathbf{N}) \times P(\mathbf{N}^2)| = 2^{\aleph_0} \cdot 2^{\aleph_0} = 2^{\aleph_0} = \mathfrak{c}$$

such orders. Hence there are at most 2^ℵ₀ = 𝔠 distinct order types for countable
orders.

By the Cantor-Bernstein theorem, it suffices to exhibit 2^ℵ₀ = 𝔠 pairwise
non-isomorphic countable orders. Consider order types of the form:

$$\sum_{n \in \mathbf{N}} \alpha_n = \alpha_1 + \alpha_2 + \cdots + \alpha_n + \cdots,$$

where each α_n is either ω + 1 or 1 + *ω. Any order X with order type of this form
is countable and the set of limit points in X will form a suborder of X of type ω,
hence the limit points of X can be listed in a sequence as “the first limit point,” “the
second limit point,” and so on.

Now note that in an order of the above type there is no two-sided limit point: The
n-th limit point will be a one-sided upper limit point if α_n = ω + 1 and will be a
one-sided lower limit point if α_n = 1 + *ω.

Thus given an infinite binary sequence (b_1, b_2, ..., b_n, ...) in 2^N, we can set
α_n = 1 + *ω if b_n = 0 and α_n = ω + 1 if b_n = 1 to get an order of the above type
in which the n-th limit point is a lower limit point if b_n = 0 and is an upper limit
point if b_n = 1. Distinct binary sequences will give non-isomorphic orders, since in
two isomorphic orders the n-th limit points (if they exist) will be of the same type.
This gives 2^ℵ₀ = 𝔠 non-isomorphic countable orders. □

The rational numbers with the usual ordering form an order which is countable,
order dense, and without endpoints. A remarkable result of Cantor asserts that, up
to isomorphism, it is the only one with those properties.


Theorem (Cantor). If X and Y are orders both of which are countable, order
dense, and without endpoints, then X and Y are similar orders. Hence any order
which is countable, order dense, and without endpoints is isomorphic to the rational
numbers Q with the usual ordering, and so must have order type η.


This gives a characterization of the order type η in terms of its structural properties.


Definition 539 (Finite Partial Isomorphisms). Let A and B be orders. By a finite
partial isomorphism between A and B we mean an order isomorphism (order
preserving bijection) between a finite suborder of A and a finite suborder of B.
In other words, a finite partial isomorphism between A and B is a bijection
f: E → F where E is a finite subset of A, F is a finite subset of B, and for
all x, y ∈ E, x <_A y ⇒ f(x) <_B f(y).


Problem 540 (Extension Lemma for Finite Partial Isomorphisms). Let A and
B be orders, E be a finite subset of A, F be a finite subset of B, and f: E → F be
a finite partial isomorphism between A and B. Prove that


1. If B is order-dense without endpoints and a ∈ A then there is a finite partial
isomorphism g extending f with a ∈ dom(g), i.e., there are finite subsets P ⊆
A, Q ⊆ B, and an order-preserving bijection g: P → Q such that g|_E = f
and a ∈ P.

2. Similarly, if A is order-dense without endpoints and b ∈ B then there is a finite
partial isomorphism h extending f with b ∈ ran(h), i.e., there are finite subsets
P ⊆ A, Q ⊆ B, and an order-preserving bijection h: P → Q such that h|_E =
f and b ∈ Q.


Theorem 541 (Cantor’s Theorem on Countable Dense Orders). Any two countable
dense orders without endpoints are isomorphic to each other.


Proof. Let A and B be countable dense orders without endpoints, and enumerate
the elements of A and B as


A = {a_1, a_2, ..., a_n, ...}, B = {b_1, b_2, ..., b_n, ...}.


We will inductively define a sequence f_1 ⊆ f_2 ⊆ ··· ⊆ f_n ⊆ ··· of finite
partial isomorphisms between A and B with each f_n extending the preceding ones,
dom(f_n) ⊇ {a_1, ..., a_n}, and ran(f_n) ⊇ {b_1, ..., b_n}.

Let f_1 be the function with dom(f_1) = {a_1} and f_1(a_1) = b_1, so that ran(f_1) =
{b_1}. Then f_1 is (trivially) a finite partial isomorphism between A and B.

Suppose next that f_n is defined with dom(f_n) ⊇ {a_1, ..., a_n} and ran(f_n) ⊇
{b_1, ..., b_n}. Since B is dense without endpoints, we can apply the first part of the
extension lemma to extend f_n to a finite partial isomorphism g between A and B such that
a_{n+1} ∈ dom(g). Next, since A is dense without endpoints, we apply the second part of the
extension lemma to extend g to a finite partial isomorphism h between A and B such that
b_{n+1} ∈ ran(h). We then put f_{n+1} = h. Then f_{n+1} extends f_n, with dom(f_{n+1}) ⊇
{a_1, ..., a_n, a_{n+1}} and ran(f_{n+1}) ⊇ {b_1, ..., b_n, b_{n+1}}, completing the inductive
construction.

Now let f = ⋃_n f_n. Then f is a well-defined function: If x is in dom(f_m) ∩
dom(f_n) then f_m(x) = f_n(x) since either f_n extends f_m (for m ≤ n) or f_m extends
f_n (for n ≤ m). The function f is an extension of every f_n. Moreover, dom(f) = A
and ran(f) = B since a_n ∈ dom(f_n) ⊆ dom(f) and b_n ∈ ran(f_n) ⊆ ran(f) for
all n. Finally, if x <_A y in A, then x = a_m and y = a_n for some m, n, and with
k = max(m, n) we get f_k(x) <_B f_k(y) (since f_k is a finite partial isomorphism)
and so f(x) <_B f(y). Thus f is a strictly increasing map from A onto B and hence
an order isomorphism between them. □
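
The back-and-forth construction in this proof can be carried out mechanically for finitely many steps. The Python sketch below is an illustration under assumptions of our own: A is taken to be Q, B is taken to be the set of dyadic rationals, both enumerations are arbitrary choices, and the extension lemma is realized by picking a midpoint (or a point one unit beyond the current extreme), which is possible because both sets are closed under these operations. It produces the finite partial isomorphism f_n, not the full isomorphism f.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import count, islice

def rationals():
    """Enumerate Q without repetition (one of many possible enumerations)."""
    seen = set()
    for n in count(1):
        for p in range(-n, n + 1):
            for q in range(1, n + 1):
                r = Fraction(p, q)
                if r not in seen:
                    seen.add(r)
                    yield r

def dyadics():
    """Enumerate the dyadic rationals k / 2**n without repetition."""
    seen = set()
    for n in count(0):
        for k in range(-n * 2**n, n * 2**n + 1):
            d = Fraction(k, 2**n)
            if d not in seen:
                seen.add(d)
                yield d

def pick_between(lo, hi):
    """Return a point strictly between lo and hi (None means no bound)."""
    if lo is None and hi is None:
        return Fraction(0)
    if lo is None:
        return hi - 1
    if hi is None:
        return lo + 1
    return (lo + hi) / 2

def extend(pairs, x, side):
    """Extension lemma: add x to the domain (side 0) or to the range (side 1).
    `pairs` is a list of (a, b) sorted by a, hence also sorted by b."""
    keys = [pair[side] for pair in pairs]
    i = bisect_left(keys, x)
    if i < len(keys) and keys[i] == x:
        return                          # x is already matched
    other = 1 - side
    lo = pairs[i - 1][other] if i > 0 else None
    hi = pairs[i][other] if i < len(pairs) else None
    y = pick_between(lo, hi)
    pairs.insert(i, (x, y) if side == 0 else (y, x))

def back_and_forth(steps):
    """Return the graph of f_steps as a list of pairs (a, f(a)) sorted by a."""
    pairs = []
    for a, b in islice(zip(rationals(), dyadics()), steps):
        extend(pairs, a, 0)             # "forth": put a_n into the domain
        extend(pairs, b, 1)             # "back": put b_n into the range
    return pairs

for a, b in back_and_forth(5):
    print(a, "->", b)
```

Alternating the two calls to `extend` is what guarantees that every element of A eventually enters the domain and every element of B eventually enters the range; dropping the second call yields only an embedding.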


Corollary 542. Any countable dense order without endpoints has order type η, and
is isomorphic to the set of rational numbers with their usual ordering.


Corollary 543. Any nontrivial countable dense order has one of the order types η,
1 + η, η + 1, or 1 + η + 1.


Corollary 544. η + η = η. Also, η + 1 + η = η, and ηη = η.


The last corollary follows from the observation that if α is the order type of a countable
dense order without endpoints, then an order having any of the types α + α, α + 1 + α, or
αα will also be a countable dense order without endpoints.


Both Cantor’s theorem and the “back-and-forth” method used to prove it are 
very powerful and have far-reaching implications. We will illustrate this by using 
Cantor’s theorem to prove (in a later chapter) two classical theorems: Brouwer’s 
Theorem and Sierpinski’s Theorem. 


A related result which is easier than Cantor’s theorem is that any countable order 
can be order embedded in any dense order. 


Theorem 545. Let A be a countable order and let B be order-dense without 
endpoints. Then A can be order embedded in B, that is, A is isomorphic to some 
suborder of B. 


Proof. It suffices to show that there is a strictly increasing f: A → B.

Enumerate A = {a_1, a_2, ..., a_n, ...} and repeatedly apply the extension lemma
(first part) to inductively get a sequence f_1 ⊆ f_2 ⊆ ··· ⊆ f_n ⊆ ··· of finite partial
isomorphisms between A and B with dom(f_n) ⊇ {a_1, ..., a_n}. Then the function
f = ⋃_n f_n is a strictly increasing map from A into B. □
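
The one-directional construction in this proof is easy to simulate. In the Python sketch below, which is our own illustration, the countable order A is the rearrangement of N of type ω² considered earlier (n = 2^(m−1)(2k − 1) ordered by the pair (m, k)), B is taken to be Q, and the extension step chooses a midpoint or a point one unit beyond the current extreme. Only a finite piece f_n of the embedding is computed.

```python
from fractions import Fraction

def block_and_position(x):
    """Write x = 2**(m-1) * (2*k - 1) and return the pair (m, k)."""
    m = 1
    while x % 2 == 0:
        x //= 2
        m += 1
    return (m, (x + 1) // 2)

def precedes(x, y):
    """The strict order of type omega^2 on the positive integers."""
    return block_and_position(x) < block_and_position(y)

def embed(elements, precedes):
    """Map distinct elements, taken in the given enumeration order, to
    rationals so that the map is strictly increasing ('forth' steps only)."""
    placed = []                          # pairs (element, image), sorted by the order
    for x in elements:
        i = 0
        while i < len(placed) and precedes(placed[i][0], x):
            i += 1
        lo = placed[i - 1][1] if i > 0 else None
        hi = placed[i][1] if i < len(placed) else None
        if lo is None and hi is None:
            y = Fraction(0)
        elif lo is None:
            y = hi - 1                   # new smallest element
        elif hi is None:
            y = lo + 1                   # new largest element
        else:
            y = (lo + hi) / 2            # strictly between the neighbours
        placed.insert(i, (x, y))
    return dict(placed)

image = embed(range(1, 31), precedes)
for x in sorted(image, key=block_and_position)[:8]:
    print(x, "->", image[x])
```

Since no "back" step is performed, the range of the resulting map is in general a proper subset of Q, which is all that an embedding requires.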


Corollary 546. If X is nontrivial and order dense, then every countable order is
order-isomorphic to some suborder of X. Hence the collection of all order types of
suborders of a nontrivial dense order includes all countable order types.

Corollary 547. Any order X of type η is a universal countable order, that is, X is
a countable order in which every countable order can be embedded.


Corollary 548. There are at most 2^ℵ₀ = 𝔠 countable order types.


Problem 549. Let E be the set of endpoints of the open intervals removed in the 
construction of the Cantor set, with the usual inherited order. Express the order type 
of E as the sum and/or product of some known order types. 


Problem 550 (The Kleene-Brouwer Order). Let N* be the set of all finite
sequences (strings) of natural numbers, and say that u = (u_1, u_2, ..., u_m) precedes
v = (v_1, v_2, ..., v_n) in the Kleene-Brouwer order on N* if either m > n and
u_k = v_k for all k ≤ n, or there is k ≤ min(m, n) with u_k < v_k but u_j = v_j
for all j < k. In other words, u precedes v in the Kleene-Brouwer order on N*
if either u properly extends v, or neither of them is an extension of the other and u
lexicographically precedes v. What is the order type of this order?


Theorem 551. Every countable order X can be continuously embedded in Q 
and in R. 


Problem 552. Prove Theorem 551. 


[Hint: Between every pair of consecutive points of X, “adjoin” an additional set of
points of order type η. The extended order is a countable dense order in which X is
continuously embedded.]


However, there are dense orders without endpoints in which no countable order 
having limit points can be continuously embedded (Problem 757). 


8.5 ℵ₀ < 𝔠: Another Proof of Uncountability of R


A notable consequence of Cantor’s theorem (Theorem 541) is that every nontrivial 
countable dense order has Dedekind gaps. Since a linear continuum is a dense order 
without gaps, it follows immediately that every continuum, and so R in particular, 
must be uncountable. 


Proposition 553. Any nontrivial countable dense order has Dedekind gaps. 
Consequently, every linear continuum is uncountable. 


Corollary 554 (Uncountability of R). R is not countable, i.e., 𝔠 > ℵ₀.


Proposition 553 is easily derived from Cantor’s theorem in various ways. We give 
three short proofs. Let X be a nontrivial countable dense order. We may assume that 
X has no endpoints (by removing them if necessary). 


Proof (First Proof). By Cantor’s theorem X is isomorphic to the ratios (positive
rationals), and we have seen that the ratios contain the gap given by {p | p² < 2} and
{p | p² > 2}, hence X must contain a gap too. □
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Proof (Second Proof). By Cantor’s theorem X must have order type 7 and another 
corollary of Cantor’s theorem was that 7 = 7 + 7. But any order whose type is 
expressible as w + a, where a is the type of a nonempty order without endpoints, 
must have a gap. Oo 


Proof (Third Proof). If we remove a point p from X, the resulting order X ∖ {p}
is still countable, order dense, and without endpoints, and so by Cantor’s theorem
X ∖ {p} is isomorphic with X. But if a point is removed from a dense order without
endpoints, the resulting order will have a gap, and so X, being isomorphic to X ∖ {p},
will have a gap too. □


We thus get a very short proof of uncountability of R by exploiting the power of 
Cantor’s isomorphism theorem on countable dense orders (Theorem 541). 
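
The gap in the ratios used in the first proof can be checked with exact rational arithmetic. The sketch below is an illustration only: the function name nudge and the formula q = (2p + 2)/(p + 2) are assumptions of this example, not taken from the text. For a positive rational p one computes q − p = (2 − p²)/(p + 2) and q² − 2 = 2(p² − 2)/(p + 2)², so q lies strictly between p and the cut and on the same side as p. Hence the lower set {p | p² < 2} has no largest element and the upper set {p | p² > 2} has no least element.

```python
from fractions import Fraction


def nudge(p):
    """Move a positive rational p strictly toward the cut without crossing it.

    If p*p < 2 the result q satisfies p < q and q*q < 2;
    if p*p > 2 the result q satisfies q < p and q*q > 2.
    """
    p = Fraction(p)
    return (2 * p + 2) / (p + 2)


if __name__ == "__main__":
    low = Fraction(7, 5)    # 49/25 < 2, so this lies in the lower set
    high = Fraction(3, 2)   # 9/4 > 2, so this lies in the upper set
    for _ in range(4):
        low, high = nudge(low), nudge(high)
        print(low, high)
```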


Remark. In his very first proof of uncountability of R, Cantor directly showed that 
every countable dense order has gaps, without using the isomorphism theorem on 
countable dense orders. (Nor did he use the diagonal method which he invented 
later). Cantor’s first proof is given in Appendix A. 


The identity η + η = η can also be proved without using Cantor’s theorem.


Problem 555. Let Q⁺ := {r ∈ Q | r > 0}, Q⁻ := {r ∈ Q | r < 0}, and
Q* := Q ∖ {0} = {r ∈ Q | r ≠ 0}. Prove the identity η + η = η by finding
three explicit order preserving bijections: The first between Q and Q⁺, the second
between Q and Q⁻, and the third between Q and Q*.


8.6 The Order Type of R 


We saw that the order property “countable, dense, and without endpoints”
characterizes the order type η: An order has type η, or is isomorphic to Q with the usual
order, if and only if it is countable, dense, and has no endpoints (Cantor’s theorem).

There is a similar characterization, also by Cantor, of the order type λ of the real
numbers. An important property of the order on the reals is that it is a continuum
without endpoints. However, this is not sufficient to characterize the order type λ
of the reals, and we will now give some examples of continuums without endpoints
which are not isomorphic to R. First, we need a definition.


Definition 556 (CCC orders). An order X is said to satisfy the countable chain 
condition, or is called a CCC order, if any family of pairwise disjoint nonempty 
open intervals in X is a countable family. 


Clearly, a non-CCC order cannot be embedded in a CCC order. 

The real line satisfies the countable chain condition, since given any family of 
pairwise disjoint nonempty open intervals in R, we can pick a rational number in 
each interval in the family. Distinct rationals will be picked for distinct intervals
since the intervals are pairwise disjoint, which gives a one-to-one correspondence
between the family and some set of the rational numbers, and so the family must be 
countable. 

The following is an example of a non-CCC continuum. 


Problem 557. Consider the subset S = (0,1) × [0,1] of the plane ordered
lexicographically. (S is the subset obtained by removing the left and right edges
of the closed unit square.)


1. Verify that the order type of S (ordered lexicographically) is (1 + λ + 1)λ.

2. Prove that S is a continuum without endpoints.

3. Show that S is not a CCC order.

4. Conclude that S is not isomorphic to the real numbers with the usual ordering,
and so (1 + λ + 1)λ ≠ λ.


Thus the non-CCC continuum S cannot be embedded in the CCC continuum R,
and any order of type (1 + λ + 1)λ is a continuum without endpoints which is not
isomorphic to the real continuum. More examples can be obtained by iterating the
above procedure. For example, an order of type (1 + λ + 1)²λ is a continuum without
endpoints which cannot be embedded even in S; see Problem 558.


Problem 558. Let λ_k denote the order type (1 + λ + 1)^k λ (with λ_0 = λ).


1. Show that each order of type λ_k is a continuum without endpoints (k =
0, 1, 2, ...).

2. Show that if m < n then an order of type λ_n cannot be embedded in an order of
type λ_m.

3. Conclude that orders having the distinct types λ_0, λ_1, ... must be non-
isomorphic continuums without endpoints.


A property which is stronger than CCC is separability. 


Definition 559 (Separable Orders). An order X is called separable if it contains
a countable subset dense in it (i.e., if there is a countable C ⊆ X with X = C ∪
D(C)).


Recall that a subset A of an order X is dense in X if and only if every nonempty 
open interval in X contains a point of A. Thus every separable order is CCC (the 
same proof given above showing R is CCC works). 

If X is order dense, then a subset A is dense in X if between any two points of 
X there is a point of A. Thus an order dense order X is separable if and only if X 
contains a countable subset C such that between any two points of X there is a point 
of C. For example, the rationals Q form a countable subset of R with this property, 
and so R is separable. On the other hand, S = (0,1) × [0,1] with lexicographic
order is a non-separable continuum. 

The property of “being a separable continuum without endpoints” characterizes
the ordering of the reals and the order type λ:


Theorem 560 (Cantor’s Order Characterization of R). Every separable
continuum without endpoints is isomorphic to the reals with their usual ordering.


Proof. Let X be a separable continuum without endpoints and let C be a countable
dense subset of X. Then the suborder C is a countable dense order without
endpoints, and so by Cantor’s theorem there is an order isomorphism f: C → Q.

Regarding Q as a suborder in R, we see that f extends (uniquely) to an order
isomorphism g between X and R, where for x ∈ X ∖ C, we set:


    g(x) = sup_R { f(u) | u ∈ C and u < x in X }.


Using the density of C in X and of Q in R, it is readily verified that g is a bijection
from X onto R which preserves order. □


Corollary 561. An ordering has order type λ (i.e., it is order isomorphic to the set
of real numbers with their usual ordering) if and only if it is a separable continuum
without endpoints.


Corollary 562. Any separable linear continuum has one of the order types λ, 1 + λ,
λ + 1, or 1 + λ + 1.


Problem 563. For each of the following order types determine if it is separable, 
dense, and/or Dedekind complete, and identify the ones which are linear continu- 
ums. If any of them is identical to a familiar type, indicate so. 


5 ay 7. (1+A)o3 
2.(1+A)* 8& (A+ 1)a 
3. (1+A)j)A 9. 9? 
4.(14+4+1) 10. (1+n)n 
5. A@ Il, n2 

6. (1+A)@ 12. 2n 


[Hint: It may be useful to represent each type as a lexicographic product of
familiar orders. For example, (1 + λ)λ has the order type of (0, 1) × [0, 1) ordered
lexicographically, and λω has the order type of N × (0, 1) ordered lexicographically
(note the reversal of the order of the factors).]


Problem 564. Let us call a point in an order a removable point if the suborder
obtained by removing this single point is order-isomorphic to the original order.
In other words, the point p in the order X is removable if X ∖ {p} is order isomorphic
to X.

For each of the following orders, determine which points are removable. All 
orderings are assumed to be their usual orders, or inherited suborder from the usual 
order. 


1. N.
2. Z.
3. Q.
4. R.
5. The unit interval [0, 1].
6. The set {0} ∪ {1/n | n ∈ N}.
7. The set {0} ∪ ⋃_{n∈N} {−1/n, 1/n}.
8. The set ⋃_{n∈N} {−1/n, 1/n}.
9. An order of type ω* + ω.
10. An order of type λη.
11. An order of type ω² + n (n ∈ N).
12. An order of type ωα, α arbitrary.


The Suslin Problem


Once it is established that a separable continuum without endpoints must be isomor- 
phic to the real line, the question arises if the result remains true if separability is 
replaced by the weaker condition of being CCC. This was first asked by Suslin. 


The Suslin Problem. Is a CCC continuum without endpoints necessarily order 
isomorphic to R? 


The affirmative answer to Suslin’s question is known as the Suslin Hypothesis 
(SH). Thus SH is the statement that every CCC continuum without endpoints has 
order type λ. Like the Continuum Hypothesis, SH is independent of comprehensive
set theoretic axiom systems for developing mathematics such as ZFC (Zermelo— 
Fraenkel Axioms with Choice). This means it has been proved that SH can neither be 
proved nor be disproved using mathematical principles and methods of proof that are 
currently accepted as standard (assuming these methods themselves are consistent). 

The Suslin Problem has played an important role in the development of axioms 
and principles of combinatorial set theory (such as constructibility and Jensen’s 
diamond and box axioms) as well as independence proofs (Martin’s Axiom and
forcing). 


8.7 Dedekind Completion 


Definition 565 (Dedekind Completion). We say that an order Y is a Dedekind
Completion of an order X if X is a suborder of Y, Y is Dedekind complete, and
every element of Y ∖ X is both an upper limit point of X and a lower limit point
of X.


Problem 566. Prove that if Y ⊇ X is a Dedekind Completion of X, then X is
dense in Y in the sense that any nonempty open interval in Y must contain a point
of X.


Theorem 567 (Existence of Dedekind Completion). Every order X has a 
Dedekind completion. 


Proof. The proof is a straightforward generalization of Dedekind’s construction of
the real numbers using cuts of rational numbers.

Let X be any order. Without loss of generality, we first replace X by an
isomorphic order E(X):


    E(X) := {Pred(a) | a ∈ X},


where Pred(a) := {x ∈ X | x < a}. Let E(X) be ordered by the proper set
inclusion relation. Then the mapping a ↦ Pred(a) is an order isomorphism from X
onto E(X).

Now let


    H(X) := {L | (L, X ∖ L) is a Dedekind gap in X},


and put M(X) := E(X) ∪ H(X), ordered again by proper set inclusion.
Then M(X) is a Dedekind Completion of E(X): Here E(X) plays the role of the
rationals and H(X) plays the role of the irrationals given by Dedekind gaps. □
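
For a finite order, the first step of this construction (replacing X by E(X)) can be checked mechanically. The following Python sketch is only an illustration under that finiteness assumption; the helper names pred and pred_map_is_order_isomorphism and the sample data are hypothetical. A finite order has no Dedekind gaps, so H(X) plays no role in it.

```python
def pred(order, a):
    """Pred(a): the set of elements of `order` strictly below a."""
    return frozenset(x for x in order if x < a)


def pred_map_is_order_isomorphism(order):
    """Check that a < b holds exactly when Pred(a) is a proper subset of Pred(b)."""
    image = {a: pred(order, a) for a in order}
    # For frozensets, `s < t` means "s is a proper subset of t".
    return all((a < b) == (image[a] < image[b]) for a in order for b in order)


if __name__ == "__main__":
    X = [2, 3, 5, 7, 11]   # a small sample order (hypothetical data)
    print(sorted(pred(X, 7)))
    print(pred_map_is_order_isomorphism(X))
```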


Problem 568 (Uniqueness of Dedekind Completion). If the order A is isomorphic
to the order B via the order-preserving bijection f: A → B, and if A′ and B′
are Dedekind completions of A and B, respectively, then there is a unique extension
f′ ⊇ f which is an order isomorphism between A′ and B′.


[Hint: For x ∈ A′ ∖ A, define f′(x) := sup_{B′} { f(u) | u ∈ A, u < x in A′ }.]


The last result implies that for any order type, there exists a unique order type for 
its Dedekind Completion. 


Definition 569 (Dedekind Completion of Order Types). If τ is an order type,
then the Dedekind Completion of τ, denoted by τ̄, is the unique order type
determined by some (or every) Dedekind completion of an order of type τ.


Problem 570. Find the Dedekind completions of each of the following types, 
expressing your answer in terms of known types: 


l.@ & 0 

2. § 9. f4%N 

3. 10.4 +n 
4.0+*o 11. 42 

5. *o+@ 12. (1 +A)? 

6. w. 13. (1+4A+1) 
7.E4+€¢ 14, 2n 


Problem 571. The Dedekind completion of any dense order is also a dense order, 
and hence, being complete as well, is a continuum. 


Problem 572 (Continuous Embedding in Dedekind Completion). Every order 
X is continuously embedded in its Dedekind completion Y, i.e., the inclusion 
(identity) map from X into Y is a continuous embedding.


The following shows that the Dedekind completion of X is the “smallest complete
order containing X”:


Problem 573 (Minimality of Dedekind Completion). If X ⊆ Y where Y is a
complete order, then Y contains a suborder (containing X) which is isomorphic to
the Dedekind completion of X.


[Hint: Given a Dedekind completion X′ of X, associate each x ∈ X′ ∖ X with the
element sup_Y {u ∈ X | u < x in X′} of Y.]


Corollary 574. Every linear continuum contains a suborder isomorphic to the
real line (with the usual order). Hence every linear continuum has cardinality
≥ 2^ℵ₀ = 𝔠.


Proof. Let X be a linear continuum. Since X is order-dense, it contains a suborder
of type η, and being complete, X contains a suborder of type η̄ = λ. □


Problem 575. Let A be the set consisting of all infinite binary sequences which are
eventually constant, except the two sequences “all zeroes” and “all ones,” ordered
lexicographically. Let B be the set of endpoints of the open intervals removed in the
construction of the Cantor set, with the order inherited from the usual order on R.


1. Show that these two orders are order isomorphic, with each having order type 2η.

2. Show that each point of A has an immediate neighbor in A, that is, show that
given any x ∈ A there is a pair of consecutive elements in A with x being one of
them.

3. Show that A is dense-in-itself, that is, every element of A is a limit point of A.

4. Let C be the set of points of the Cantor set except 0 and 1. Show that C is a
Dedekind completion of B.


Problem 576. Prove that the order type of the Cantor set is 1 + 2η + 1.


(In fact, from a result of Brouwer to be proved later (Theorem 1099), it follows that
any bounded perfect subset of R which does not contain any interval has order type
1 + 2η + 1.)
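
The set of endpoints of the removed intervals, used in Problem 575, can be listed stage by stage with exact fractions. The sketch below assumes the usual middle-thirds construction on [0, 1]; the function names removed_intervals and endpoints are hypothetical. It only enumerates finitely many stages and is not a substitute for the proofs asked for above.

```python
from fractions import Fraction


def removed_intervals(depth):
    """Open middle thirds removed in the first `depth` stages, as sorted (left, right) pairs."""
    kept = [(Fraction(0), Fraction(1))]
    removed = []
    for _ in range(depth):
        next_kept = []
        for a, b in kept:
            third = (b - a) / 3
            removed.append((a + third, b - third))   # the open middle third of [a, b]
            next_kept.append((a, a + third))         # left closed third survives
            next_kept.append((b - third, b))         # right closed third survives
        kept = next_kept
    return sorted(removed)


def endpoints(depth):
    """All endpoints of the intervals removed in the first `depth` stages, in increasing order."""
    return sorted(p for interval in removed_intervals(depth) for p in interval)


if __name__ == "__main__":
    for left, right in removed_intervals(2):
        print(left, right)
```

Each removed interval contributes its two endpoints as a pair with no point of the Cantor set strictly between them.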


8.8 Properties of Complete Orders and Perfect Sets 


Bolzano—Weierstrass and Nested Intervals Properties 


We say that an order X has the Bolzano—Weierstrass property (or BW property) if 
every bounded infinite subset of X has a limit point in X. 


Theorem 577 (Bolzano-Weierstrass). For an arbitrary order, completeness
implies the Bolzano–Weierstrass property.


Proof. Let E be a bounded infinite subset of the complete order X, say a < E < b
for some a, b ∈ X. Put


    E_x := {y ∈ E | y < x},   and   L := {x ∈ X | E_x is finite}.


Then E_a is finite but E_b is infinite, so a ∈ L and L < b, hence c := sup L exists.
If now c ∈ L, then c is a lower limit point of E: Since E_c is finite, c < b, and
for any z > c, E_z is infinite and so E_z ∖ E_c is infinite, so there is some p ∈ E with
c < p < z.

If c ∉ L, then c is an upper limit point of E: We have L < c, and for any x < c there is
y ∈ L with x < y < c, so E_c ∖ E_y is infinite, hence there is p ∈ E with y < p < c,
and so x < p < c. □


There are two nested intervals properties (“NIP’’s) that we will consider. 


1. We say that an order X satisfies the sequential nested intervals property if given
any nested sequence of nonempty bounded closed intervals in X


    I_1 ⊇ I_2 ⊇ ··· ⊇ I_n ⊇ I_{n+1} ⊇ ··· ,


we have ⋂_n I_n ≠ ∅.

2. An order is said to satisfy the strong nested intervals property if whenever a
family F of nonempty bounded closed intervals forms a chain (i.e., for any two
intervals I_1, I_2 ∈ F either I_1 ⊆ I_2 or I_2 ⊆ I_1), we have ⋂F ≠ ∅.


Trivially the strong nested intervals property implies the sequential nested intervals 
property. The following two theorems show how the two nested intervals properties 
are related to completeness and the Bolzano—Weierstrass property. 


Theorem 578 (“BW Implies NIP”). In an arbitrary order, the Bolzano–
Weierstrass property implies the sequential nested intervals property.


Proof. Let I_n = [a_n, b_n] (n = 1, 2, ...) be a nested sequence of nonempty closed
intervals in an order X having the Bolzano–Weierstrass property, so that


    a_1 ≤ a_2 ≤ ··· ≤ a_n ≤ a_{n+1} ≤ ··· ≤ b_{n+1} ≤ b_n ≤ ··· ≤ b_2 ≤ b_1.


Now either the sequence (a_n | n ∈ N) is eventually constant so that there exist a ∈ X
and k ∈ N with a_n = a for all n ≥ k, in which case a ∈ ⋂_n I_n. Or else, the set
L := {a_n | n ∈ N} of left endpoints of the intervals I_n is a bounded infinite set and
so L has a limit point c ∈ X. Then c ∈ ⋂_n I_n, since if c < a_n or c > b_n for some n
then c cannot be a limit point of L. □


Theorem 579 (The Strong Nested Intervals Property). For an arbitrary order, 
completeness implies the strong nested intervals property. 


Proof. Let A be the set of left endpoints of the intervals in F. Since F is a chain, any
right endpoint of any interval in F is an upper bound of A, and so a := sup A exists.
Then a ≥ p for any left endpoint p of any interval in F. Also, a ≤ q for any right
endpoint q of any interval in F. Hence a ∈ ⋂F. □
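
The role of completeness here can be seen by contrast with the rationals. The Python sketch below is an illustration only (the function name nested_rational_intervals and the bisection scheme are assumptions of this example). It produces nested closed intervals [a, b] with rational endpoints satisfying a² < 2 < b² and with lengths halving at each step. Regarded as intervals of R they all contain √2, which is their only common point, so regarded as intervals of Q they have empty intersection.

```python
from fractions import Fraction


def nested_rational_intervals(n):
    """First n intervals (a, b) with rational endpoints and a*a < 2 < b*b, found by bisection."""
    a, b = Fraction(1), Fraction(2)
    intervals = []
    for _ in range(n):
        intervals.append((a, b))
        mid = (a + b) / 2
        # mid is rational, so mid*mid is never exactly 2.
        if mid * mid < 2:
            a = mid
        else:
            b = mid
    return intervals


if __name__ == "__main__":
    for a, b in nested_rational_intervals(6):
        print(a, b, b - a)
```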


The above results can be summarized as: 


              Bolzano–Weierstrass Property
             ↗                            ↘
Completeness                                Sequential NIP
             ↘                            ↗
                       Strong NIP


None of the implications above can be reversed and no further implications between 
the above properties can be obtained. The Bolzano—Weierstrass and Nested Interval 
properties are strictly weaker than completeness. In fact, there is an order which 
satisfies both the Bolzano—Weierstrass and the Strong Nested Interval properties 
but is not complete, and there is an order which satisfies the Sequential NIP but 
satisfies neither the Bolzano—Weierstrass nor the Strong Nested Intervals properties. 
Moreover, the Bolzano—Weierstrass and the strong Nested Interval properties are 
independent of each other. For these counterexamples, see Problem 711 in Chap. 10. 


Problem 580. Show that in a dense order which is separable, all four properties 
displayed above are equivalent, and therefore completeness is characterized by any 
of the other three properties. 


Definition 581 (Monotone and Convergent Sequences). Let X be an order and
let (x_n) = (x_n | n ∈ N) be a sequence of points in X. We say that the sequence
(x_n) is monotone increasing if x_n ≤ x_{n+1} for all n. Similarly one defines the notion
of monotone decreasing sequences. A monotone sequence is one which is either
monotone increasing or monotone decreasing. We also say that the sequence (x_n)
converges to a point p ∈ X if for any a < p there exists k such that x_n > a for all
n ≥ k and for any b > p there exists k such that x_n < b for all n ≥ k.


Problem 582 (The Monotone Convergence Property). Show that in any order, 
the Bolzano—Weierstrass property is equivalent to the condition that any bounded 
monotone sequence converges to some point. 


Dense-in-Itself Orders


Recall that an order X is a dense order if and only if every point of X except the first 
(if present) is an upper limit point if and only if every point of X except the last (if 
present) is a lower limit point. In particular, if every element of X is an upper limit 
point or if every element of X is a lower limit point then X must be order-dense. 
But if we are given that every element of X is either an upper- or a lower limit point,
then, as we will see soon, X may have consecutive points, and may in fact be quite
far from being order-dense. We thus get a wider class of orders using the condition 
“every point of the order is a limit point, upper or lower”: 


Definition 583 (Density-in-Itself). A subset A of an order X is called dense-in-itself
if every element of A is a limit point of A (in X), that is, if A ⊆ D(A).
An order is called dense-in-itself if it is dense-in-itself as a subset of itself.


Every dense order is dense-in-itself, but there are dense-in-itself orders which are 
not order-dense. 

Consider the suborder X = (0, 1] U [2, 3) of R as an order by itself. Then X is 
dense-in-itself but not order-dense as there is no element in X which is between 1 
and 2. Another example of an order which is dense-in-itself but not order-dense is 
the Cantor set K as an order by itself: There is no element in K between the two 
elements 1/3 and 2/3. 

We can get dense-in-itself orders with lots of consecutive points. The following 
examples are orders in which every point is a limit point yet every point is one of 
the two points of a pair of consecutive points. 


Problem 584. Let X = {0,1,2,3,4,5,6,7,8,9}^N be the set of “infinite decimal
strings” ordered lexicographically, and let Y be the suborder consisting of those
members of X which are either eventually 0 or eventually 9 but not all 0 or all 9.
Let E be the suborder of R consisting of the endpoints of the open intervals removed
in the construction of the Cantor set. Show that:


1. Y and E are isomorphic orders and have order type 2η.

2. Any order of type 2η (like Y or E) is dense-in-itself (every point is a limit point).

3. In any order of type 2η, every point has either an immediate successor or an
immediate predecessor, and consequently no point is a two-sided limit point.

4. Any countable dense-in-itself order without endpoints and without any two-sided
limit point must have order type 2η.


Problem 585. Let the plane set T := (0,1) × {0,1} = {(x, y) | 0 < x <
1, y = 0 or 1} be ordered lexicographically, so that it has order type 2λ. Show that
such an order is complete, separable, dense-in-itself, yet every point is one of the
two points of a pair of consecutive points.


The order of the last example cannot be embedded in R, giving us an example of a 
separable complete order which cannot be embedded in R. 


Problem 586. Show that the set X = {0,1,2,3,4,5,6,7,8,9}^N of “infinite
decimal strings” ordered lexicographically is order isomorphic to the Cantor set
(as a suborder of R). [Hint: Show that X is a Dedekind completion of the suborder
consisting of strings which are eventually 0 or eventually 9, which has order type
1 + 2η + 1.]


Problem 587. Let A be a subset of an order X.


1. Show that if A is a dense-in-itself subset of X, then the suborder A as an order 
by itself is a dense-in-itself order. 
2. Give an example to show that the converse of the above fails. 


Problem 588. Let X be a nontrivial order of order type γ. Show that X is not
order-dense if and only if γ = α + 2 + β for some order types α, β, and X is not
dense-in-itself if and only if γ = α + 3 + β for some order types α, β.


Problem 589. Let C be the set of all initial segments of the set Q of rational
numbers. Thus C includes sets of the form {x ∈ Q | x ≤ r} (r rational),
{x ∈ Q | x < r} (r rational or irrational), as well as ∅ and Q. Show that C
ordered by proper set-inclusion is order-isomorphic to the Cantor set with the usual
order.


Theorem 590. Let X be an order which is complete and dense-in-itself. Then R
can be order embedded in X and so the cardinality of X is at least 2^ℵ₀ = 𝔠.


Proof. Let X be an order which is complete and dense-in-itself, and let A be the subset
of X consisting of all lower limit points of X. Then the suborder A is order-dense,
that is, between any two points of A there is another point of A: If x, y ∈ A with
x < y then since x is a lower limit point of X there exist u, v ∈ X with x < u <
v < y. Now either u ∈ A or u is not a lower limit point, in which case u will have
an immediate successor in X, say u′, with u < u′ ≤ v < y. Being an immediate
successor, the element u′ is not an upper limit point and so must be a lower limit
point in X. Thus either u ∈ A or u′ ∈ A, and in either case we have a point of A
between x and y.

It follows that A and so X must contain a subset of order type η. But X is
complete and so by minimality of Dedekind completion, X will contain a subset
of order type η̄ = λ. □


Problem 591. Give an example of a bounded subset A of R such that no point of A
is a limit point of A but the suborder A has order type η.

Show that for such an A, D(A) must have order type 1 + 2η + 1 and A ∪ D(A)
(called the closure of A) must have order type 1 + 3η + 1.


Closed and Perfect Sets 


Definition 592. Let X be an order and A ⊆ X. We say that A is closed in X if A
contains all its limit points, that is, if D(A) ⊆ A. A subset which is both closed and
dense-in-itself is called perfect.


Theorem 593. Let X be a complete order and let A ⊆ X. Then A is closed in X if
and only if the following two conditions both hold:


1. The suborder A is continuously embedded in X.
2. A is complete (as an order by itself).


Problem 594. Prove Theorem 593.


Problem 595. Give an example of a closed suborder of a general order which is
neither continuously embedded (in the parent) nor complete (by itself).


Problem 596. Let X be a complete order. Show that F ⊆ X is closed if and only if
for any bounded nonempty subset A ⊆ F we have both sup A ∈ F and inf A ∈ F.


[Hint: Use the fact that if sup E exists but sup E ∉ E then sup E must be an upper
limit point of E.]


Problem 597. Let X and Y be orders and let f: X → Y be continuous. Show that
for any b ∈ Y, the set {x ∈ X | f(x) ≤ b} is a closed subset of X.


Cardinalities of Perfect Sets in a Complete Order 


Since a perfect subset of a complete order, taken as an order by itself, must be dense- 
in-itself and complete, it follows from Theorem 590 that any nonempty perfect 
subset of a complete order contains a suborder isomorphic to R, and so must have 
cardinality ≥ 𝔠.


Theorem 598. Every nonempty perfect subset of a complete order has cardinality
≥ 2^ℵ₀ = 𝔠.


The notions of closed, dense-in-itself, and perfect sets are due to Cantor. 


8.9 Connectedness and the Intermediate Value Theorem 


Recall Dedekind’s basic definition of the continuum: An order is a continuum if 
and only if for any Dedekind partition of the order into nonempty upper and lower 
segments, at least one of the segments will contain a limit point of the other. We will 
now see that for any partition of a continuum into two nonempty disjoint sets, at least 
one of them will contain a limit point of the other. Thus this last stronger condition, 
known as topological connectedness, also characterizes linear continuums. We can 
state the result equivalently in terms of partitions into a pair of closed sets as follows. 


Theorem 599. Let X be a continuum. If A and B are disjoint closed subsets of X
with A ∪ B = X, then either A = ∅ and B = X or A = X and B = ∅.


Proof. Suppose that A and B are disjoint closed subsets of X with A ∪ B = X.
To get a contradiction, assume that both A and B are nonempty and fix a ∈ A and
b ∈ B. Without loss of generality we assume that a < b. Put


    E := {x ∈ B | a < x ≤ b},   c := inf E,   and   D := {x ∈ X | a ≤ x < c}.


Note that c = inf E exists since E is a nonempty bounded set in the complete order
X, and thus E, c, and D are all well defined. Also note that E ⊆ B and D ⊆ A.
Since E ⊆ B and B is closed, we have c = inf(E) ∈ B. Therefore a < c, and so D
must be an infinite set with no maximum (since X is a dense order). It follows that
c = sup D must be a limit point of D, and hence of A, so c ∈ A (as A is closed).
But this is a contradiction since A ∩ B = ∅. □


Recall that a subset E of an order is called a segment if whenever u < z < v and
u, v ∈ E then z ∈ E.


Corollary 600 (The Intermediate Value Theorem). If X is a continuum, Y is any
order, and f: X → Y is continuous, then ran(f) = f[X] is a segment in Y.


Proof. To get a contradiction assume the conditions of the corollary and suppose that
f[X] is not a segment, so that there exist u < z < v in Y such that u, v ∈ f[X] but
z ∉ f[X]. Put A := {x ∈ X | f(x) ≤ z} and B := {x ∈ X | f(x) ≥ z}. Then A
and B are nonempty disjoint closed sets in X with A ∪ B = X, which contradicts
Theorem 599. □


We summarize these results as characterizations of the continuum.


Problem 601. Let X be an order. Then the following are equivalent.


1. X is a continuum: X is a dense order without Dedekind gaps.

2. X is topologically connected: X cannot be partitioned into two disjoint
nonempty closed sets.

3. X satisfies the Intermediate Value Theorem: For any order Y and any continuous
f: X → Y, the image f[X] = ran(f) is a segment in Y.


Chapter 9 
Well-Orders and Ordinals 


Abstract This chapter develops the classical theory of well-orders and ordinals 
in a naive setting. Ordinals are defined as order types of well-orders, not as 
von Neumann ordinals. We cover the basic ordinal operations of sum and product, 
transfinite induction and recursion, uniqueness of isomorphisms and ranks, unique 
representation of well-orders by initial sets of ordinals, the comparability theorem 
for well-orders, the division algorithm, remainder ordinals, ordinal exponentiation, 
and the Cantor Normal Form. 


9.1 Well-Orders, Ordinals, Sum, and Product 


Cantor invented two remarkable generalizations of the natural numbers extending 
into the transfinite. One is the notion of cardinal numbers: Two sets have the same 
cardinal number if their elements can be put in a one-to-one correspondence— 
without any regard for the ordering between the elements themselves. The other 
is the notion of an ordinal number, which, roughly speaking, represents the “serial 
position, relative to the beginning, of an object in a queue.” The number 10 can be used
either as a cardinal number (as in “there are 10 students in the room”) or as an ordinal
number (as in “I am the 10-th person waiting in line”). The distinction becomes
sharper if we imagine an ordered infinitely long endless queue of people, where each
person in the line is the n-th person from the beginning (n = 0, 1, 2, ...). The queue
has no end, but we can imagine a new person joining in at the end (with infinitely
many people ahead), occupying the “ω-th position” in the queue. Here the serial or
ordinal position of any person is defined as the order type of the part of the queue
preceding that person. Still another person can be adjoined to the end, who is then
called the (ω + 1)-st person. The serial positions of these last two people, ω and
ω + 1, are different ordinal numbers, but in the cardinal sense they both have ℵ₀
people ahead. The process of extending such ordered queues by adding new members
at the end while preserving the ordering of the preceding part can be continued
indefinitely through the transfinite. The type of orders that can be generated in this
way are the well-orders.


Criteria for Well-Ordering 


Given an order X and a ∈ X, the set of predecessors of a in X will be denoted
by Pred_X(a), or simply by Pred(a) when the order X is clear from context. For a
subset A of X, we say that a is an immediate successor of A if A < a and there is 
no b with A < b < a. Note that for a € X the immediate successor of Pred(a) is 
a. In particular, the immediate successor of the empty set is the first element of the 
order (if the order does not have a first element then the empty set does not have an 
immediate successor). 


Theorem 602 (Equivalent Conditions for Well-Ordering). Let X be a nonempty
linear order. The following conditions are equivalent:


1. X is a complete order with a first element, in which every element except the last
(if present) has a next element.

2. X is a complete order with a first element but without any lower limit point.

3. X has a first element, and every Dedekind partition is either a jump or an upper
limit cut (i.e., there are no gaps or lower limit cuts).

4. Every proper initial segment in X is an initial open interval: If A ⊊ X is an
initial segment then A = Pred(a) for some a ∈ X.

5. Every non-cofinal subset of X has an immediate successor.

6. Every nonempty subset of X contains a least element.

7. (DC) There is no strictly decreasing infinite sequence in X, i.e., X has no suborder
of type *ω.


Problem 603. Prove Theorem 602. 


[Hint: The implications 1 ⇒ 2 ⇒ 3 ⇒ 4 ⇒ 5 ⇒ 6 ⇒ 7 are all routine. For
7 ⇒ 6, note that if A ⊆ X is nonempty but has no minimal element, then the
relation R defined on A by xRy ⇔ x > y satisfies the condition of DC.]


Definition 604 (Well-Orders and Ordinals). 


• A well-order is an order which is either empty or satisfies any (and so all) of the
conditions of the last theorem.

• If X is any order and A ⊆ X we say that A is well-ordered by the parent order
(or simply that A is a well-ordered subset of X) if the suborder on A, inherited
from the parent order on X, is a well-order.

• An ordinal is the order type of a well-order.


From Part 4 of Theorem 602 we immediately have


Corollary 605. Let X be a well-order with order type α. The set of all proper initial
segments of X, ordered by set inclusion, is isomorphic to X. The set of all initial
segments of X has order type α + 1.


Theorem 606. Every subset (suborder) of a well-order is a well-order. Every finite
linear order is a well-order. Any order of type ω, such as the positive integers N with
their usual ordering, is a well-order.


Corollary 607. Every finite order type n is an ordinal, and ω is an ordinal.


Problem 608. Let X be an order with X = A ∪ B, where A < B. If each of A and
B is well-ordered by the parent order on X, then X itself is a well-order. 


Corollary 609. The sum of two ordinals is an ordinal.


The above result is a special case of the following more general fact.


Problem 610. Let X be an order which is expressed as the union of a finite number
of subsets, say as X = ⋃_{i=1}^{n} X_i. If each X_i is a well-ordered subset of X (i =
1, 2, ..., n) then X is a well-order.


Using the last corollary, we get more examples of ordinals, such as ω + n (n =
1, 2, ...), ω2 = ω + ω, ω2 + n (n = 1, 2, ...), ω3 = ω2 + ω, etc.


Problem 611. If A and B are well-orders then so are B × A and A × B, under
lexicographic ordering.


Corollary 612. The product of two ordinals is an ordinal. 


Definition 613 (Limit and Successor Ordinals). An ordinal is called a limit 
ordinal if it is the order type of some nonempty well-order without a last element. 
An ordinal is called a successor ordinal if it is the order type of a well-order which 
has a last element. 


Note that limit and successor ordinals must be nonzero, and that α is a successor
ordinal if and only if α = β + 1 for some ordinal β. Examples of successor ordinals
are 2, ω + 9, and ω2 + 1, while ω and ω3 are limit ordinals.


Problem 614. Let X be a well-order and let x ∈ X. Prove that exactly one of the
following must be true:


• x is the first element of X
• x is a successor element in X, i.e., x has an immediate predecessor in X
• x is an upper limit point in X


It follows that for every ordinal α, either α = 0 or α is a successor ordinal or α is a
limit ordinal (the three possibilities are mutually exclusive).


Problem 615. An order X is a well-order without any (upper) limit point if and
only if Pred(x) is finite for all x ∈ X.


An order of type ω (such as the set N of positive integers with the usual ordering)
is a nonempty well-order without a last element and without any upper limit point,
and is characterized (up to order type) by these properties:


Problem 616. Let X be a nonempty well-order without a last element and without
any upper limit point. Prove that the order type of X must be ω.


[Hint: Since every element of X must have an immediate successor, there is a
function g: X → X such that g(x) is the immediate successor of x in X. Hence
by the principle of recursive definition there is f: N → X such that f(1) = the
least element of X and f(n + 1) = g(f(n)) for all n ∈ N. Show that f maps N
onto X and that f is an order isomorphism of the positive integers (under the usual
ordering) with X.]


Problem 617. Prove that any infinite well-order not containing any limit point must
be of type ω.


Thus ω is the unique limit ordinal which cannot be expressed as α + β where α is
limit and β is nonzero.

One can also rearrange the elements of N to get other ordinals. Consider


1, 2, 3, 4, ..., n, n+1, ...          (order type ω)
2, 3, 4, 5, ..., n, n+1, ...  1       (order type ω + 1)
3, 4, 5, 6, ..., n, n+1, ...  1, 2    (order type ω + 2)


The first order has no last element while the other two have last elements, and the last
element of the second order, 1, is an upper limit element while the last element of
the third order above, 2, is a successor element. Since order isomorphisms preserve
all structural properties, no two of the three orders above are isomorphic, and
hence the ordinals ω, ω + 1, and ω + 2 are all distinct.

Suppose we put all the odd positive integers before all the even ones but 
otherwise order them following their usual order. This can be formally defined as 
an ordering (N, <_o) where m <_o n if and only if m is odd and n is even, or m and n
have the same parity and m < n. This ordering (N, <_o) is exhibited as


1 <_o 3 <_o 5 <_o 7 <_o ⋯    2 <_o 4 <_o 6 <_o 8 <_o ⋯


and has order type ω + ω = ω2.
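The ordering <_o can be tried out on a finite sample. The following is a minimal Python sketch, not part of the original text; the names `odd_first_key` and `less_o` are illustrative assumptions. It encodes the definition of m <_o n as a sort key and lists the first eight positive integers in <_o-order.

```python
def odd_first_key(n: int) -> tuple[int, int]:
    """Sort key realizing <_o: odd numbers first, then even numbers,
    each block in its usual order."""
    # First component: 0 for odd, 1 for even (the odd block precedes the even block).
    # Second component: the number itself (usual order inside a block).
    return (0 if n % 2 == 1 else 1, n)


def less_o(m: int, n: int) -> bool:
    """Return True exactly when m <_o n."""
    return odd_first_key(m) < odd_first_key(n)


sample = sorted(range(1, 9), key=odd_first_key)
print(sample)
```

Only a finite initial stretch of each block can be listed this way; the comparison `less_o` itself is defined for all positive integers.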
Similarly, we see that the following are all ordinals (where n is any finite ordinal):


0, 1, 2, ..., n, ...,   ω, ω+1, ω+2, ..., ω+n, ...,   ω+ω = ω2,
ω2+1, ω2+2, ...,   ω2+ω = ω3,   ω3+1, ω3+2, ...,   ω3+ω = ω4,   ...


Problem 618. Prove that if α + β = ω and β ≠ 0 then β = ω. Prove that if
α + β = ω² and β ≠ 0 then β = ω².


Problem 619. Let α be an order type of an ordering with a first element (so that
α = 1 + β for some order type β). Prove that (1 + λ)α is a Dedekind complete
order type if and only if α is an ordinal.


9.2 Limit Points and Transfinite Induction 


If P is a property, then we use the predicate notation “P(a)” to indicate that “the 
property P is true of the element a.” Recall: 


The principle of finite induction. Let A be an ordering which is either finite or is
of order type ω. Let a_0 be the first element of A. Suppose that P is any property
satisfying (for all a, b ∈ A):

• P(a_0) is true;

• If P(a) is true and b is an immediate successor of a then P(b) is true.


Then P(a) is true for all a ∈ A.


We will show that a generalized version of the principle of finite induction, called 
the principle of transfinite induction, holds for all well-orders. But first let us note 
that the principle of finite induction, as stated above, does not hold for general well- 
orders (other than orders which are finite or of type w). 


Example 620. Consider again the ordering <_o on N of order type ω + ω = ω2,
where all the odd natural numbers come before all the even ones:


1 <_o 3 <_o 5 <_o 7 <_o ⋯    2 <_o 4 <_o 6 <_o 8 <_o ⋯


Note that in this ordering 3 is an immediate successor of 1, and 4 is an immediate
successor of 2, but 2 is not the immediate successor of any element. In fact for
m, n ∈ N, n is an immediate successor of m in the ordering <_o if n = m + 2.
Let P be the property of being an odd positive integer. Then P(a) is true for the
first element of (N, <_o), namely 1. Also if P(a) is true (“a is odd”) and b is an
immediate successor of a (“b = a + 2”) then P(b) is true (“b is odd”). Hence both
conditions of the principle of finite induction are true for this ordering (N, <_o). Yet
it is false that P(a) holds for all a.
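The example can be checked mechanically on a bounded walk. The sketch below is illustrative only (the function names and the bound of 1000 steps are assumptions, not from the text): it follows immediate successors from the first element 1 and confirms that both clauses of finite induction hold along the way, while the element 2 is never met.

```python
def immediate_successor(m: int) -> int:
    """Immediate successor of m in (N, <_o): the next number of the same parity."""
    return m + 2


def is_odd(n: int) -> bool:
    """The property P from the example."""
    return n % 2 == 1


# Walk a bounded number of successor steps from the first element 1.
reached = []
x = 1
for _ in range(1000):
    reached.append(x)
    x = immediate_successor(x)

# Both clauses of finite induction hold along the walk ...
assert is_odd(reached[0])
assert all(is_odd(immediate_successor(a)) for a in reached if is_odd(a))
# ... yet the walk never meets the element 2, where P fails.
print(2 in reached, is_odd(2))
```

A bounded loop cannot prove that 2 is unreachable, of course; that follows from parity, since every step preserves oddness.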


The reason for this failure is easily found. The principle of finite induction holds for 
only those orderings in which every element can be reached starting from the first 
element by a “finite number of individual steps of moving to the next immediate 
successor.” And the only orderings in which this last condition is satisfied are the 
ones which are finite or of type ω. The ordering (N, <_o), which is of type ω + ω,
contains the element 2 which cannot be reached from the first element by a finite
number of steps of moving to the immediate successor. In fact 2 is an upper limit
point in the ordering (N, <_o).


Recall that for any point x in a well-order X there are three mutually exclusive
and exhaustive possibilities:


• x is the first element of X;

• x is a successor element (x is the immediate successor of some element);

• x is a limit element (x is an upper limit point in X; a well-order cannot have a
lower limit point).


The principle of finite induction will hold in a well-order so long as the third 
possibility above (existence of upper limit point) does not arise. 

In the above example, the property P of being an odd positive integer was indeed 
true for all numbers preceding 2 in the ordering (N, <_o), but there was nothing to
“induce” the property P to the limit element 2. For that, we will need a condition 
by which whenever a property holds for certain points, it can be “transferred” or 
“induced” to hold for any upper limit point of those points. Once we enhance the 
principle of induction by adding such a clause, it will apply to all well-orderings: 


The principle of transfinite induction. Let A be any well-order with first element
a_0, and P be any property which satisfies (for all a, b):

• P(a_0) is true;

• If P(a) is true and b is an immediate successor of a then P(b) is true;

• If a is an upper limit point of the set {x ∈ A | P(x) is true} then P(a) is true.


Then P(a) is true for all a ∈ A.


The proof is straightforward: To get a contradiction, let there be a ∈ A for which
P(a) is not true, and let a be the least such element. Then a can neither be the first
element a_0, nor can it be an immediate successor of some other element, nor
can it be an upper limit point, which is a contradiction since these possibilities are
exhaustive in a well-order.


One can restate the principle of transfinite induction in terms of sets (instead of 
“properties’’) as follows: 


The principle of transfinite induction (set version). Let A be a well-order with
first element a_0, and let P ⊆ A be such that for all a, b:

• a_0 ∈ P;

• If a ∈ P and b is an immediate successor of a then b ∈ P;

• If a is an upper limit point of P then a ∈ P.


Then P = A. 


It is possible to combine the three clauses of transfinite induction into a single
“strong induction” clause in which we can avoid mentioning “limit point” or “first
point.” The advantage of such a form is that it covers both finite and transfinite
induction via the concise statement “Every strongly inductive subset of a well-order
equals the entire order”:


The principle of strong induction (finite and transfinite). Let A be any well-order,
and let P ⊆ A be such that for any a ∈ A:


Pred(a) ⊆ P ⇒ a ∈ P.


Then P = A.


Strong induction actually characterizes the property of being well-ordered. 


Problem 621. Let X be an order. A subset P of X is called strongly inductive if
Pred(a) ⊆ P ⇒ a ∈ P (for all a ∈ X), and the order X is said to satisfy strong
induction if every strongly inductive subset of X equals X. Show that X satisfies
strong induction if and only if it is a well-order.


We next give a version of transfinite induction where an assertion can be 
established for all well-orders. In this case, P should be taken to be a property 
of orders, such as being a complete order. 


Transfinite induction for all well-orders. Let P be a property of orders such that 
if every proper initial segment of any well-order X has property P, then X has 
property P. Then all well-orders have property P. 


The proof is again routine: Assume the condition of the theorem but suppose that
there is a well-order A which does not have property P. Then some proper initial
segment of A will fail to have property P, so we can fix the least a ∈ A such that
Pred(a) does not have property P. But then every proper initial segment of Pred(a)
has property P (by minimality of a) while Pred(a) itself does not have property P,
contrary to the condition of the theorem.


The following is the corresponding principle in terms of properties of ordinals. 


The principle of transfinite induction for all ordinals. Let P be a property of
ordinals such that for any ordinal α, if every ordinal less than α has property P,
then α has it too. Then all ordinals have property P.


The Principle of Recursive Definition in Sect. 2.10 (as in Theorem 148) also 
generalizes to well-orders, where it is called the principle of transfinite recursion. It 
is a very useful principle in practice and is often used to define functions on well- 
orders (or on initial segments of ordinals). 


Theorem 622 (Transfinite Recursion). Let A be any well-order, Y be a
nonempty set, ℱ be the collection of all functions whose domain is an initial
segment of A and whose range is contained in Y, and G: A × ℱ → Y. Then there
is a unique function F: A → Y satisfying, for all a ∈ A, the recursion condition:


F(a) = G(a, F ↾ Pred(a)).


Proof. The proof is similar to the proof of the Basic Principle of Recursive
Definition (Theorem 146 in Sect. 2.10).

Let us say that a function h is partially G-recursive if dom(h) is an initial
segment of A, ran(h) ⊆ Y, and h(a) = G(a, h ↾ Pred(a)) for all a ∈ dom(h).
If h is partially G-recursive then so is h ↾ I if I is any initial segment of dom(h).

We first have the following uniqueness property: If h, h′ are partially G-recursive
functions with a ∈ dom(h) ∩ dom(h′) then h(a) = h′(a). To prove this by transfinite
induction, assume that h(x) = h′(x) for all x ∈ Pred(a). Then h(a) = G(a, h ↾
Pred(a)) = G(a, h′ ↾ Pred(a)) = h′(a).

Next, define a relation F ⊆ A × Y by the condition xFy if and only if there is a
partially G-recursive h with x ∈ dom(h) and h(x) = y. The theorem will be proved
if we show that F is a function, dom(F) = A, and F is partially G-recursive.
Uniqueness of F follows from the uniqueness property that we just proved.

Assume xFy and xFy′. Then y = h(x) and y′ = h′(x) for some partially
G-recursive h and h′, hence by the uniqueness property that we proved, h(x) =
h′(x) and so y = y′. Thus F is a function. It is also easy to see that F must be
partially G-recursive. Finally, we show that dom(F) = A by transfinite induction.
Suppose that a ∈ A and x ∈ dom(F) for all x ∈ Pred(a). Then the function
h := F ↾ Pred(a) is partially G-recursive. Put b = G(a, h), and extend h to h* :=
h ∪ {(a, b)}. Then h* is a partially G-recursive function with a ∈ dom(h*), hence
a ∈ dom(F). □
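On a finite well-order the recursion condition can be carried out literally. The following Python sketch is an illustration under stated assumptions, not the construction used in the proof: the well-order is given as a list from least to greatest, functions are dictionaries, and the names `recurse` and `G_example` are hypothetical. Each value F(a) is computed from a alone together with the restriction of F to Pred(a).

```python
from typing import Callable, Dict, Hashable, List


def recurse(order: List[Hashable],
            G: Callable[[Hashable, Dict[Hashable, int]], int]) -> Dict[Hashable, int]:
    """Build F with F(a) = G(a, F restricted to Pred(a)).

    `order` lists the elements of a finite well-order from least to greatest.
    """
    F: Dict[Hashable, int] = {}
    for a in order:
        # At this point F is exactly the restriction of F to Pred(a).
        F[a] = G(a, dict(F))
    return F


def G_example(a: Hashable, h: Dict[Hashable, int]) -> int:
    """Sample G: one more than the sum of all values already defined."""
    return 1 + sum(h.values())


F = recurse([0, 1, 2, 3, 4], G_example)
print(F)
```

With this sample G the values double at each step, since F(a) is one more than the sum of all earlier values.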


9.3 Well-Orders and Ordinals: Basic Facts 


Recall that if A is an initial segment in a well-order X with A ≠ X then A =
Pred(a) for some (unique) a ∈ X.


Theorem 623. Let X be a well-order and f: X → X be an order preserving
(strictly increasing) injection. Then x ≤ f(x) for all x ∈ X.


Proof. Otherwise, there would be a least a such that f(a) < a, but then b = f(a)
is a still smaller element for which f(b) < b, a contradiction. □


A function f such as above need not be onto. For example the mapping n ↦ n² is
a strictly increasing mapping of N into N. However, we have:


Corollary 624. The only order-preserving isomorphism of a well-order onto itself
is the identity mapping.


Proof. If f: X → X is an order isomorphism of the well-order X onto itself, let
g: X → X be the inverse of f so that f(g(x)) = x for all x ∈ X. Then for any
x ∈ X we have x ≤ f(x) and also x ≤ g(x), so f(x) ≤ f(g(x)) = x, and hence
f(x) = x. □


Corollary 625 (Uniqueness of Isomorphisms). If A and B are isomorphic
well-orders, then there is a unique isomorphism from A onto B.


Another important immediate corollary of the theorem is:


Corollary 626. A well-order is never order isomorphic to any of its proper initial 
segments. 


Problem 627. Let A be a subset of a well-order X which is strictly bounded above,
that is, there is b ∈ X with a < b for all a ∈ A. Show that the suborder A cannot
be isomorphic to X.


The above facts limit isomorphisms between initial segments of a well-order: 


Corollary 628 (“Initial Rigidity of Well-Orders”). If A and B are initial segments
of a well-order X and f: A → B is an order isomorphism from A onto B,
then A = B and f is the identity map on A = B.


In any order X, we define the rank of an element a ∈ X to be the order type of
Pred(a). If X is not well-ordered, two different elements may have the same rank.
For example, in any order of type ζ, all elements have the same rank *ω; moreover,
in the set Z of integers with the usual ordering, if m and n are any two integers, then
the mapping x ↦ x + n − m is an order-automorphism of Z which sends m to n,
so that m and n are structurally indistinguishable within the order. The same is true
for orderings of type η or λ. The situation for well-orders is strikingly different:


Corollary 629 (Unique Ranks). In a well-order, distinct initial segments have
distinct order types, i.e., distinct elements have distinct ranks. Hence each element
in a well-order is uniquely determined by its rank.


This fact is further exemplified in Theorem 633 below, which exhibits the natural 
one-to-one correspondence between the elements of a well-order and the set of 
ordinals representing the ranks of those elements. 

Initial rigidity allows a proper definition for comparing ordinals: 


Definition 630 (Ordering of Ordinals). Given ordinals α and β with representative
well-orders A and B, we define α < β if A is order-isomorphic to some proper
initial segment of B. We write α ≤ β for α < β or α = β.


Corollary 631. The relation α < β defined on any set of ordinals is irreflexive and
transitive (hence asymmetric).


(This situation is again specific to well-orders. An attempt to extend this definition
to all orderings will fail because asymmetry and irreflexivity will be violated,
producing oddities such as *ω < *ω or η < η, and we would even get both η < η + 1
and η + 1 < η holding at the same time!)

The definition immediately implies that if α and β are ordinals with
corresponding representative well-orders A and B, then α ≤ β if and only if
there is a unique order isomorphism from A onto a unique initial segment of B. The
trichotomy property for < will be established in Theorem 636.


Given an ordinal α and a representative well-order A with order type α, we let W(α)
denote the set of order types of proper initial segments of A. Note that this definition
of W(α) is independent of the choice of the representative well-order A. Moreover,
for every ordinal β, we have β < α if and only if β is the order type of some proper
initial segment of A. So we can define:


Definition 632. Given any ordinal α, let W(α) := {β | β is an ordinal < α}.
Thus we have:

W(0) = ∅,

W(1) = {0} = W(0) ∪ {0},

W(2) = {0, 1} = W(1) ∪ {1},

W(3) = {0, 1, 2} = W(2) ∪ {2},


W(n) = {0, 1, 2, ..., n − 1},
W(n + 1) = {0, 1, 2, ..., n} = W(n) ∪ {n},


W(ω) = {0, 1, 2, ..., n, ...},
W(ω + 1) = {0, 1, 2, ..., n, ..., ω} = W(ω) ∪ {ω},


etc.


Under the relation < on ordinals, the set W(α) of ordinals less than α is itself a
well-order of order type α:


Theorem 633 (Representation Theorem for Well-Orders). Given a well-order
A with ordinal α, there is a unique order isomorphism between A and W(α): a
strictly increasing bijection f: A → W(α) defined by f(a) = the rank of a in A =
the order type of Pred(a).

The inverse of this bijection gives a strictly increasing complete enumeration of
A indexed by the ordinals less than α:


A = {a_0 <_A a_1 <_A a_2 <_A ⋯ <_A a_ξ <_A ⋯}   (ξ < α),


with a_μ <_A a_ν for all μ < ν < α, where a_ξ is the unique element in A having
rank ξ.
In particular, W(α) is well-ordered by < and its order type (ordinal) is α.


Proof. This is an immediate consequence of the unique rank property
(Corollary 629) and the definition of < for ordinals. □


From the last statement in the theorem it follows that if W(α) = W(β) then α = β.
Also, if ξ < α, then W(ξ) is a proper initial segment in W(α) with order type ξ so
that W(ξ) is the set of predecessors of the element ξ ∈ W(α) with ξ itself having
rank ξ in W(α); and conversely, by uniqueness of ranks, a proper initial segment in
W(α) having order type ξ must equal W(ξ).


Corollary 634. For all ordinals α, β we have:


1. W(α) = W(β) if and only if α = β.

2. An initial segment of W(α) having order type ξ must equal W(ξ).

3. A is an initial segment of W(α) if and only if A = W(ξ) for some ξ ≤ α.

4. W(α) ⊊ W(β) if and only if α < β.


Example 635. Recall the ordering <_o on N having order type ω + ω = ω2, where
all the odd natural numbers come before all the even ones, as in:


1 <_o 3 <_o 5 <_o 7 <_o ⋯    2 <_o 4 <_o 6 <_o 8 <_o ⋯

The set W(ω + ω) = W(ω2) is

W(ω2) = {0, 1, 2, ..., n, ..., ω, ω+1, ω+2, ..., ω+n, ...}.


The natural correspondence between (N, <_o) and the ordinals in W(ω2) is then
seen as:


1    <_o  3    <_o  5    <_o  7    <_o  ⋯         2    <_o  4    <_o  6    <_o  8    <_o  ⋯
↕         ↕         ↕         ↕                   ↕         ↕         ↕         ↕
0    <    1    <    2    <    3    <    ⋯         ω    <    ω+1  <    ω+2  <    ω+3  <    ⋯
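The correspondence in this example is given by an explicit formula for the rank. As a hedged sketch (the pair encoding and the names `rank` and `show` are assumptions made for illustration), an ordinal ω·k + m below ω2 is stored as the pair (k, m); Python compares such pairs lexicographically, which agrees with the ordering of these ordinals.

```python
from typing import Tuple

Ordinal = Tuple[int, int]   # (k, m) stands for ω·k + m; only k = 0 or 1 occurs here


def rank(n: int) -> Ordinal:
    """Rank of the positive integer n in (N, <_o), as an ordinal below ω2."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n % 2 == 1:
        return (0, (n - 1) // 2)   # odd n: preceded only by the smaller odd numbers
    return (1, n // 2 - 1)         # even n: preceded by all odds and the smaller evens


def show(o: Ordinal) -> str:
    """Render (0, m) as m and (1, m) as ω or ω+m."""
    k, m = o
    if k == 0:
        return str(m)
    return "ω" if m == 0 else f"ω+{m}"


for n in (1, 3, 5, 7, 2, 4, 6, 8):
    print(n, "->", show(rank(n)))
```

The element 2 receives rank ω, matching its role as the upper limit point of the ordering.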


Theorem 636 (Well-Ordering and Ordinal Comparability Theorem). Given
ordinals α and β, exactly one of α < β, β < α, and α = β holds (trichotomy).
Hence if A and B are well-orders, either A is isomorphic to an initial segment of B
or B is isomorphic to an initial segment of A.


Proof. Put C = W(α) ∩ W(β), so that C is an initial part of W(α) and also of
W(β). Hence C = W(ξ) (where ξ = order type of C), with both ξ ≤ α and ξ ≤ β.
But we cannot have both ξ < α and ξ < β, as otherwise we would get ξ ∈ C = W(ξ)
(contradicting irreflexivity of <). Hence either ξ = α, in which case α ≤ β, or
ξ = β, in which case β ≤ α. □


Thus if A is any set of ordinals, then A must be linearly ordered. In fact, A must be
well-ordered, since otherwise we would have a nonempty B ⊆ A without a least
element, and then for any α ∈ B the set B ∩ W(α) would be a nonempty subset of
W(α) without a least element, a contradiction.


Corollary 637. Any set of ordinals is linearly ordered and in fact well-ordered.


Definition 638 (Initial Sets of Ordinals). A set A of ordinals is called an initial
set of ordinals if α ∈ A and β < α ⇒ β ∈ A.


For every ordinal α, W(α) is an initial set of ordinals having order type α. In fact,
these are the only initial sets of ordinals:


Corollary 639. Every initial set A of ordinals equals W(α), where α is the order
type of A.


This follows from comparability: If A is an initial set of ordinals with order type α,
we cannot have α ∈ A (since otherwise W(α) would be a proper initial segment of
A of order type α), and so A ⊆ W(α) by comparability; hence A = W(β) for some
β, which implies that β = order type of W(β) = order type of A = α.


Well-Ordered Sum of Ordinals 


Recall that we needed the Axiom of Choice to define arbitrary sums of order types
(Definition 473) of the form

\[
\sum_{i \in I} \alpha_i \qquad (I \text{ an order, } \alpha_i \text{ an order type for each } i \in I).
\]

AC was needed twice: First for choosing representative orders of type α_i for each
i ∈ I (existence), and then again for choosing isomorphisms between multiple
representatives for each type when showing that the order type of the sum does not
depend on the choice of representatives (uniqueness).

A nice consequence of the unique representation property for well-orders is that
if each α_i is an ordinal (i ∈ I), then the above sum can be defined in an effective
canonical fashion without using AC at all: For the existence part, we can simply
let W(α_i) be the canonical representative well-order of type α_i (for each i ∈ I).
The uniqueness part follows immediately from the uniqueness of isomorphisms for
well-orders.


Theorem 640 (Arbitrary Sum of Ordinals without AC). If I is any order and α_i
is an ordinal for each i ∈ I then the sum

\[
\sum_{i \in I} \alpha_i
\]

is defined and unique even if we do not assume the Axiom of Choice.


Proof. Uniqueness follows from the fact that if X_i and X_i′ are representative
well-orders of type α_i then there is a unique order isomorphism between X_i and X_i′. For
existence, take W(α_i) as the representative well-order for α_i and order ⋃_{i∈I} {i} ×
W(α_i) lexicographically (by “first difference”). □


We will be interested in the case where I is a well-order, when the sum itself
becomes an ordinal.


Lemma 641. Suppose that X is an order with X = ⋃_{i∈I} S_i such that each S_i is
well-ordered as a suborder of X (for i ∈ I), i < j ⇒ S_i < S_j (for all i, j ∈ I),
and I is itself a well-ordered set. Then X is a well-order.


Proof. Let X, I, and (S_i | i ∈ I) be as above, and let E be a nonempty subset of
X = ⋃_{i∈I} S_i. Let J = {i ∈ I | S_i ∩ E ≠ ∅}. Then J is nonempty and since I
is a well-order, J contains a smallest member i_0 ∈ J. Then S_{i_0} ∩ E is a nonempty
subset of S_{i_0}, and since S_{i_0} is a well-ordered subset of X, S_{i_0} ∩ E contains a smallest
element p ∈ S_{i_0} ∩ E. It is then easily verified that p is the least element of E. □


Corollary 642 (Well-ordered sum of ordinals is an ordinal). If I is well-ordered
and α_i is an ordinal for each i ∈ I then ∑_{i∈I} α_i is an ordinal.


The product αβ of two ordinals α and β is conveniently viewed as “α repeated β
times,” or equivalently as “β copies of α.” For example, we have:

\[
1\omega = 1 + 1 + 1 + \cdots = \sum_{n<\omega} 1 = \omega, \qquad
2\omega = 2 + 2 + 2 + \cdots = \sum_{n<\omega} 2 = \omega, \quad \text{etc.,}
\]

while

\[
\omega^2 = \omega\omega = \sum_{n<\omega} \omega = \omega + \omega + \omega + \cdots
\]

We can keep going further using repeated sum. For example, after ω³ = ω²ω =
∑_{n<ω} ω², and ω⁴ = ω³ω = ∑_{n<ω} ω³, etc., we can get the following larger ordinal
which will later be denoted by ω^ω:

\[
\sum_{n<\omega} \omega^n = 1 + \omega + \omega^2 + \omega^3 + \cdots + \omega^n + \cdots
\]


Problem 643. 1. Simplify the sum ∑_{n<ω} n = 1 + 2 + 3 + ⋯ + n + ⋯.
2. Simplify the sum ∑_{n<ω} ωn = ω + ω2 + ω3 + ⋯ + ωn + ⋯ as a single ordinal.
3. Find a re-ordering of N having the order type of the previous part.


9.5 Successor, Supremum, and Limit 


Given any ordinal α, note that W(α) ∪ {α} is an initial set of ordinals whose greatest
element is α, and we define S(α), the successor of α, by


S(α) := the order type of W(α) ∪ {α}
      = the unique β such that W(β) = W(α) ∪ {α}.


The ordinal S(α) is the least ordinal greater than α and is the same as α + 1, but the
above definition is independent of the notion of sum of ordinals.

If E is any set of ordinals, put:


Pred[E] := ⋃_{β∈E} W(β) = {α | α < β for some β ∈ E}.


Problem 644. For any set E of ordinals, Pred[E] is an initial set of ordinals and
therefore equals W(γ) for a unique ordinal γ. The ordinal γ is the least upper bound
of E, that is, we have α ≤ γ for all α ∈ E and there is no γ′ < γ such that α ≤ γ′
for all α ∈ E.


Definition 645. For any set E of ordinals, put


sup E = sup_{α∈E} α := the unique ordinal γ such that Pred[E] = W(γ).


Problem 646. Show that for any set E of ordinals

1. If E = ∅, then sup E = 0;

2. If E has a largest element α, then sup E = α;

3. If E is nonempty and has no largest element, then sup E is the unique limit
ordinal γ such that α < γ for all α ∈ E and such that for all β < γ there is
α ∈ E with β < α < γ.


In the last case above, sup E is called the limit of the elements of E, and denoted by


lim E = lim_{α∈E} α := sup E.


Problem 647. For any set E of ordinals, show that

1. E′ := E ∪ ⋃_{α∈E} W(α) is the smallest initial set of ordinals containing E.

2. The order type of E′ equals sup_{α∈E} S(α) = sup{S(α) | α ∈ E}.

3. sup_{α∈E} S(α) is the least ordinal greater than all elements of E.


In particular, for any set E of ordinals there is an ordinal greater than all elements
of E, with sup_{α∈E} S(α) being the least such ordinal.


Problem 648. For any set E of ordinals, show that

\[
\sup_{\alpha \in E} S(\alpha) =
\begin{cases}
\gamma + 1 & \text{if } E \text{ has a largest element } \gamma, \\
\sup E & \text{otherwise.}
\end{cases}
\]


Problem 649. Given a set C of well-orders, effectively construct a well-order
whose order type is the supremum of the order types of the well-orders in C.


Theorem 650 (Transfinite Recursion over all Ordinals). Let G be a function
which assigns an object G(h) to every function h whose domain is an initial set of
ordinals (i.e., with dom(h) = W(α) for some ordinal α). Then there exists a unique
function F defined for all ordinals such that:


F(α) = G(F ↾ W(α)), for every ordinal α.


Proof. For each ordinal α, apply Theorem 622, with the well-order A there replaced
by W(α + 1), to get a unique function F_α with domain W(α + 1) and satisfying
the recursion condition F_α(β) = G(F_α ↾ W(β)) for all β ∈ W(α + 1). Define
F(α) := F_α(α). Note that if α < β then F_β extends F_α (by uniqueness). Hence for
every ordinal α, F extends F_α, and therefore F(α) = F_α(α) = G(F_α ↾ W(α)) =
G(F ↾ W(α)). □
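Restricted to finite ordinals, the recursion F(α) = G(F ↾ W(α)) is ordinary course-of-values recursion on the natural numbers. The Python sketch below is only an illustration of that special case; the names `recurse_on_naturals` and `G_factorial` are hypothetical. Note that G receives no argument other than the function h built so far, and recovers the current stage from the domain of h.

```python
from typing import Callable, Dict


def recurse_on_naturals(G: Callable[[Dict[int, int]], int], n: int) -> int:
    """Compute F(n) where F(k) = G(F restricted to W(k)) and W(k) = {0, ..., k-1}."""
    F: Dict[int, int] = {}
    for k in range(n + 1):
        F[k] = G(dict(F))   # F currently holds exactly the values on W(k)
    return F[n]


def G_factorial(h: Dict[int, int]) -> int:
    """G sees only h; since dom(h) = W(k), the stage k equals len(h)."""
    k = len(h)
    return 1 if k == 0 else k * h[k - 1]


print([recurse_on_naturals(G_factorial, n) for n in range(6)])
```

Here F(0) = 1 and F(k) = k · F(k − 1), so the values printed are the factorials of 0 through 5.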


9.6 Operations Defined by Transfinite Recursion 


For given ordinals α and β, one can use transfinite recursion to define the ordinal
sum α + β as the β-th successor of α, i.e., α + β is obtained starting from α by
repeatedly applying the successor operation β times. Here the recursion is done on
the second argument β, which means α + β is defined assuming that α + γ has been
defined for all γ < β. Breaking up into three cases, we have:

\[
\alpha + \beta =
\begin{cases}
\alpha & \text{if } \beta = 0; \\
S(\alpha + \gamma) & \text{if } \beta = S(\gamma) \text{ is the successor of } \gamma; \\
\sup_{\gamma < \beta} (\alpha + \gamma) & \text{if } \beta \text{ is a limit ordinal.}
\end{cases}
\]

Here the first argument α can be regarded as a parameter.
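For ordinals below ωω the sum has a simple closed form, which makes the zero and successor clauses easy to test. The following Python sketch is an illustration under an assumed encoding, not a general implementation: the pair (k, m) stands for ω·k + m with k, m natural numbers, and the names `succ` and `add` are hypothetical.

```python
from typing import Tuple

Ordinal = Tuple[int, int]   # (k, m) encodes ω·k + m with k, m natural numbers


def succ(o: Ordinal) -> Ordinal:
    """Successor S(o): add one to the finite part."""
    k, m = o
    return (k, m + 1)


def add(a: Ordinal, b: Ordinal) -> Ordinal:
    """Closed form for a + b on ordinals below ω·ω."""
    (k1, m1), (k2, m2) = a, b
    if k2 == 0:
        return (k1, m1 + m2)   # adding a finite ordinal on the right
    return (k1 + k2, m2)       # the finite part of a is absorbed by ω·k2


zero, one, omega = (0, 0), (0, 1), (1, 0)
print(add(one, omega), add(omega, one))
```

The printed pairs encode 1 + ω = ω and ω + 1, showing that the sum is not commutative. The limit clause cannot be tested by a finite computation and is not checked here.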


Problem 651. Show that the above informal definition by transfinite recursion can
be cast into the more formal framework of Theorem 650 by first fixing α and taking
G_α to be:

\[
G_\alpha(h) :=
\begin{cases}
\alpha & \text{if } h \text{ is empty;} \\
\sup\{S(\xi) \mid \xi \in \operatorname{ran}(h) \text{ and } \xi \text{ is an ordinal}\} & \text{otherwise.}
\end{cases}
\]

That is, for any ordinal α, if we use this G_α in Theorem 650 to obtain F_α with
F_α(β) = G_α(F_α ↾ W(β)) for all β, then F_α(β) = α + β for all ordinals β.


From now on, however, we will simply use the informal version of definition by
transfinite recursion, without explicitly displaying the function G needed by the
formal setup.


Problem 652. Show that for all ordinals α, β, the ordinal sum α + β as defined
above by transfinite recursion coincides with the usual sum of order types.


In a similar way, we can use transfinite recursion to define the ordinal product α·β:

\[
\alpha \cdot \beta =
\begin{cases}
0 & \text{if } \beta = 0; \\
(\alpha \cdot \gamma) + \alpha & \text{if } \beta = S(\gamma) \text{ is the successor of } \gamma; \\
\sup_{\gamma < \beta} \alpha \cdot \gamma & \text{if } \beta \text{ is a limit ordinal.}
\end{cases}
\]


Problem 653. Show that α·β = αβ for all ordinals α, β, that is, the ordinal product
as defined above by transfinite recursion coincides with the usual product of order
types.


As a result, all rules valid for sums and products of order types will apply to ordinals.
In particular, the associative law and the left distributive law hold.

In the definitions above for ordinal sum and product, the limit ordinal clauses say
that each of these operations is “continuous” in the second variable, that is, if β
is a limit ordinal so that β = lim_{γ<β} γ, then:

\[
\lim_{\gamma<\beta} (\alpha + \gamma) = \alpha + \lim_{\gamma<\beta} \gamma,
\qquad \text{and} \qquad
\lim_{\gamma<\beta} (\alpha\gamma) = \alpha \lim_{\gamma<\beta} \gamma.
\]


Problem 654. Show that neither the ordinal sum operation nor the ordinal product 
operation is continuous in the first variable. 


Problem 655. 1. ω is the smallest limit ordinal.
2. If α is an ordinal then α + ω is the smallest limit ordinal greater than α.
3. α is a limit ordinal if and only if α = ωβ for some ordinal β > 0.


Problem 656. If A is a well-order of type α and β is the order type of a suborder
of A, then β ≤ α.


Problem 657. If α < β then γ + α < γ + β, and conversely. This gives
left-cancellation for addition: If γ + α = γ + β then α = β. If α ≤ β then α + γ ≤ β + γ.

For γ > 0, if α < β then γα < γβ and conversely. This gives left-cancellation
for products: If γα = γβ and γ > 0 then α = β. If α ≤ β then αγ ≤ βγ.


Subtraction. If α ≤ β then there is a unique γ such that α + γ = β. This γ is
denoted by −α + β and is a form of (one-sided) subtraction for ordinals, so that


β = α + (−α + β), whenever α ≤ β.


This follows immediately from the fact that given two well-orders, one of them is
isomorphic to a unique initial segment of the other. It is easy to see that we have
−α + (α + β) = β for all ordinals α and β. Note also that if α ≤ β, then −α + β
is the order type of W(β) ∖ W(α).
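A few instances may help fix the notation. The following worked examples are added for illustration and are not from the original text; each can be verified directly from the defining identity α + (−α + β) = β:

\[
-\omega + (\omega 2 + 3) = \omega + 3, \qquad
-3 + \omega = \omega, \qquad
-(\omega + 1) + (\omega + 4) = 3,
\]

since ω + (ω + 3) = ω2 + 3, 3 + ω = ω, and (ω + 1) + 3 = ω + 4. The second instance shows that removing a finite initial part from an order of type ω leaves an order of type ω.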


Theorem 658 (Division Algorithm). If α, β are ordinals with α > 0 then there
are unique ordinals η and ξ such that


β = αη + ξ, with ξ < α.


Proof. Note that with ν = β + 1 we have αν > β. Let μ be the least ordinal such
that αμ > β, so that μ ≤ β + 1. Clearly μ > 0. Then μ must be a successor ordinal,
since if μ were a limit ordinal, then β would be an upper bound of E := {αν | ν < μ}, which
would imply that αμ = sup E (= least upper bound of E) ≤ β, which contradicts
αμ > β. So we can write μ = η + 1 for some η. Then αη ≤ β, and we can put
ξ = −αη + β, giving β = αη + ξ. And we must have ξ < α, for otherwise we
would get ξ = α + γ for some γ, giving β = αη + α + γ ≥ α(η + 1) = αμ,
contradicting β < αμ.
For uniqueness, suppose that


αη + ξ = αη′ + ξ′ with ξ, ξ′ < α.


Then η ≥ η′, for otherwise η′ = η + ζ with ζ > 0, so αη + ξ = αη′ + ξ′ =
αη + αζ + ξ′, and left-cancellation would give ξ = αζ + ξ′ ≥ α, contradicting
ξ < α. Similarly η′ ≥ η, and so η = η′. Hence αη + ξ = αη + ξ′, and so ξ = ξ′ by
left-cancellation. □


Problem 659 (Even and Odd Ordinals). Call an ordinal $\alpha$ even if it can be
expressed in the form $\alpha = 2\gamma$ and call it odd if $\alpha = 2\gamma + 1$ (for some ordinal
$\gamma$). Show that every ordinal is either even or odd but not both. Show that every limit
ordinal is even.


9.7 Remainder Ordinals and Ordinal Exponentiation 


We say that an ordinal $\beta > 0$ is a remainder of an ordinal $\gamma$ if $\gamma = \alpha + \beta$ for
some ordinal $\alpha$. Thus the finite ordinal 3 has as remainders 1, 2, and 3, and 0 has
no remainder. In general the finite ordinal $n < \omega$ has exactly $n$ remainders, namely
$1, 2, \dots, n$. The only remainder of the ordinal $\omega$ is $\omega$ itself.


Problem 660. An ordinal can have at most finitely many remainders. 


[Hint: Note that the remainders of $\gamma$ are given by the ordinals of the form $-\alpha + \gamma$
for $\alpha < \gamma$, and the mapping $\alpha \mapsto (-\alpha + \gamma)$ is monotonically decreasing, that is, if
$\alpha < \alpha'$ then $-\alpha + \gamma \ge -\alpha' + \gamma$.]

Definition 661. An ordinal $\rho > 0$ is said to be a remainder ordinal if the only
remainder of $\rho$ is $\rho$ itself, that is, if whenever $\rho = \alpha + \beta$ with $\beta > 0$, we have

$$\beta = \rho.$$


In other words, $\rho$ is a remainder ordinal if it is the type of a nonempty well-order $A$
in which every nonempty terminal segment is isomorphic to the entire order $A$. It is
also easily seen that $\rho$ is a remainder ordinal if and only if $\alpha + \rho = \rho$ for all $\alpha < \rho$.

We have seen that 1 and $\omega$ are remainder ordinals. Other than 1, all remainder
ordinals must be limit ordinals.


Problem 662. If $\alpha$ is a remainder ordinal, then so are $\alpha\omega$ and $\omega\alpha$, and $\alpha\omega$ is the
smallest remainder ordinal greater than $\alpha$.


Thus after 1 and $\omega$, the next remainder ordinal is $\omega^2$, the following one is $\omega^3$, and
we get the sequence

$$1, \omega, \omega^2, \omega^3, \dots, \omega^n, \dots$$

of the first $\omega$ remainder ordinals. Writing $1 = \omega^0$, this sequence consisting of the
first $\omega$ remainder ordinals can be expressed as $\langle \omega^n \mid n < \omega \rangle$.


Problem 663. If E is any nonempty set of remainder ordinals without a largest 
element, then sup E is a remainder ordinal, and so is the least remainder ordinal 
greater than all the ones in E. 


Thus we have the ordinal

$$\sup_{n<\omega} \omega^n = \sup\{1, \omega, \omega^2, \omega^3, \dots, \omega^n, \dots\}$$

as the least remainder ordinal greater than all $\omega^n$, $n < \omega$. This ordinal is denoted
by $\omega^\omega$.

More generally, for any two ordinals $\alpha$ and $\beta$, we can define the ordinal
exponentiation $\alpha^\beta$ by transfinite recursion on $\beta$:


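Definition 664 (Ordinal Exponentiation).

$$\alpha^\beta := \begin{cases} 1 & \text{if } \beta = 0; \\ (\alpha^\gamma)\alpha & \text{if } \beta = S(\gamma) \text{ is a successor ordinal}; \\ \sup_{\gamma<\beta} \alpha^\gamma & \text{if } \beta \text{ is a limit ordinal.} \end{cases}$$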


Note that the above definition does not create a conflict with our previous usage
of the notation $\alpha^n$ as an abbreviation for $\alpha\alpha\cdots\alpha$ ($n$ factors). From the last clause
of the definition we have “continuity in the exponent,” i.e., ordinal exponentiation
$\alpha, \beta \mapsto \alpha^\beta$ is continuous in the second variable $\beta$:

$$\lim_{\gamma<\beta} \alpha^\gamma = \alpha^\beta, \quad\text{where $\beta$ is any limit ordinal.}$$

Ordinal exponentiation should be carefully distinguished from cardinal
exponentiation. For example, for cardinals we had

$$2^{\aleph_0} > \aleph_0,$$

while with ordinal exponentiation we have:

$$2^\omega = \sup_{n<\omega} 2^n = \omega.$$

(For finite ordinals and cardinals, the two notions coincide.)
Using transfinite induction, we can establish the main properties of ordinal 
exponentiation: 


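Problem 665 (Properties of Ordinal Exponentiation).

1. $1^\alpha = 1$, and $0^\alpha = 0$ for $\alpha > 0$.

2. $\alpha^\beta \alpha^\gamma = \alpha^{\beta+\gamma}$.

3. $(\alpha^\beta)^\gamma = \alpha^{\beta\gamma}$.

4. For $\alpha > 1$, $\alpha^\beta \ge \beta$ and $\beta < \gamma \Rightarrow \alpha^\beta < \alpha^\gamma$.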


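The largest ordinal that we have seen so far is $\omega^\omega = \sup_{n<\omega} \omega^n$, but we can proceed
further as:

$$\omega^\omega + 1 < \omega^\omega + \omega < \omega^\omega + \omega^2 < \omega^\omega + \omega^\omega = \omega^\omega 2 < \omega^\omega \omega = \omega^{\omega+1} < \omega^{\omega 2}$$

on to

$$\omega^{\omega^2} < \omega^{\omega^3} < \cdots, \quad\text{and in the limit:}\quad \omega^{\omega^\omega} = \sup_{n<\omega} \omega^{\omega^n}.$$

In fact, using exponentiation we get:

$$\omega^{\omega^\omega} < \omega^{\omega^{\omega^\omega}} < \cdots, \quad\text{and}\quad \varepsilon_0 := \sup\{\omega, \omega^\omega, \omega^{\omega^\omega}, \dots\}.$$

The ordinal $\varepsilon_0$ has the property $\omega^{\varepsilon_0} = \varepsilon_0$. Ordinals $\alpha$ which satisfy the equation
$\omega^\alpha = \alpha$ are called epsilon numbers (Cantor).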


Problem 666. Show that $\varepsilon_0$ is the smallest epsilon number, and that for every
ordinal there is a greater ordinal which is an epsilon number.


The next epsilon number after $\varepsilon_0$ is called $\varepsilon_1$, the next epsilon number after $\varepsilon_1$ is
called $\varepsilon_2$, and so on.

An ordinal is said to be a countable ordinal if it is the order type of some
countable well-order (i.e., a well-order defined on some countable set). Since ordinal
sum and product coincide with the ordinary sum and product of order types, we see
that sums and products of countable ordinals are countable ordinals. The countable
ordinals are also closed under forming “countable limits”:

Problem 667 (CAC). If $E$ is a countable set of countable ordinals without a
largest element, then their limit $\lim E = \sup E$ is also a countable ordinal.


From these facts it follows by transfinite induction that if $\alpha$ and $\beta$ are countable
ordinals then so is $\alpha^\beta$.

The countable ordinals thus have quite strong closure properties, and all the
ordinals above including $\omega^\omega$, $\varepsilon_0$, $\varepsilon_1$, and so on, are countable ordinals.


Definition 668 (Sum-Closed and Product-Closed Ordinals). An ordinal $\xi$ is
called sum-closed if $\alpha, \beta < \xi \Rightarrow \alpha + \beta < \xi$, and $\xi$ is called product-closed if
$\alpha, \beta < \xi \Rightarrow \alpha\beta < \xi$.

Problem 669 (Characterization of Remainder Ordinals). Let $\rho$ be a nonzero
ordinal. Then the following conditions are equivalent.


1. $\rho$ is a remainder ordinal.
2. $\rho$ is a sum-closed ordinal.
3. $\rho = \omega^\alpha$ for some ordinal $\alpha$.


Problem 670. An ordinal $\rho > 2$ is product-closed if and only if $\rho = \omega^{\omega^\alpha}$ for some
ordinal $\alpha$.


Definition 671 (Normal Functions). Suppose that $F(\alpha)$ is an ordinal for each
ordinal $\alpha$. We say that $F$ is a normal function if $F$ is increasing, i.e., $\alpha < \beta \Rightarrow
F(\alpha) < F(\beta)$, and $F$ is continuous, i.e., $F(\alpha) = \sup_{\beta<\alpha} F(\beta)$ for every limit
ordinal $\alpha$.


Normal functions are frequently encountered in the theory of ordinals. The sum, 
product, and power functions are normal in the second variable (i.e., when the first 
argument is held fixed), but not in the first variable. 


Problem 672. Show that a normal function $F$ must have arbitrarily large fixed
points, i.e., for each ordinal $\alpha$ there is an ordinal $\beta > \alpha$ with $F(\beta) = \beta$.


We can generalize the notion of iterated derived sets in orderings using ordinals as 
follows. Let X be an order and A be a subset. Recall that D(A) denotes the set of 
limit points of A in X. 


Definition 673 (Cantor-Bendixson Derivative). Let $X$ be an order and let $A \subseteq
X$. For each ordinal $\alpha$, define $D^{(\alpha)}(A)$, the $\alpha$-th iterated derived set of $A$, by
transfinite recursion as follows:

$$D^{(0)}(A) := A, \qquad D^{(\alpha+1)}(A) := D(D^{(\alpha)}(A)),$$

$$D^{(\gamma)}(A) := \bigcap_{\beta<\gamma} D^{(\beta)}(A) \quad\text{if $\gamma$ is a limit ordinal.}$$

Problem 674. Let $E$ be any initial set of ordinals, considered as an order by itself.
Show that for each ordinal $\alpha > 0$,

$$D^{(\alpha)}(E) = \{\gamma \in E \mid \gamma = \omega^\alpha \beta \text{ for some } \beta\}.$$

Conclude that if $X := \{\gamma \mid \gamma \le \omega^\alpha\} = W(\omega^\alpha) \cup \{\omega^\alpha\}$, then in the order $X$ we have
$D^{(\alpha)}(X) \ne \emptyset$, but $D^{(\alpha+1)}(X) = \emptyset$.


Problem 675 (Hausdorff’s Ordinal Exponentiation). Let $A$ and $B$ be well-orders
with $A$ nonempty and let $a \in A$ be the first element of $A$. Let $E$ be the
set of all functions $f$ from $B$ to $A$ such that $f(x) = a$ for all but finitely many
$x \in B$. Thus $E \subseteq A^B$. For $f, g \in E$, define $f <_H g$ if for some $b \in B$ we have
$f(b) < g(b)$ with $f(b') = g(b')$ for all $b' > b$.


1. Show that the relation $<_H$ linearly orders $E$.
2. Show that $E$ is well-ordered by $<_H$ with order type $\alpha^\beta$, where $\alpha$ and $\beta$ are the
order types of $A$ and $B$, respectively.
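
For finite well-orders the construction of Problem 675 can be checked by direct enumeration. The following is a minimal sketch, assuming $A = \{0, \dots, a-1\}$ and $B = \{0, \dots, b-1\}$ with their usual orders (the names `less_H` and `rank_H` and the sizes 3 and 2 are illustrative choices, not part of the text). A function $f\colon B \to A$ is stored as a tuple, and its position in the order $<_H$ turns out to be its value as a base-$a$ numeral whose most significant digit sits at the largest element of $B$.

```python
from itertools import product

def less_H(f, g):
    """Return True if f <_H g: at the largest position b where f and g
    differ, f[b] < g[b]. Equal functions are not related."""
    for b in reversed(range(len(f))):
        if f[b] != g[b]:
            return f[b] < g[b]
    return False  # f == g

def rank_H(f, a):
    """Position of f in the <_H order when A = {0, ..., a-1}: the value of f
    read as a base-a numeral with f[b] as the digit of weight a**b."""
    return sum(digit * a**b for b, digit in enumerate(f))

a, b = 3, 2  # hypothetical sizes: |A| = 3, |B| = 2
functions = list(product(range(a), repeat=b))  # all maps B -> A, as tuples
by_rank = sorted(functions, key=lambda f: rank_H(f, a))
```

In this finite case the order has exactly $a^b$ elements, in agreement with the order type $\alpha^\beta$ claimed in part 2.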


9.8 The Canonical Order on Pairs of Ordinals 


Recall the lexicographic order on $W(\alpha) \times W(\alpha)$, of order type $\alpha^2$. We will now
define a different order $\lhd$ on pairs of ordinals called the canonical order.
We first partition $W(\alpha) \times W(\alpha)$ into $\alpha$-many sets $P_\xi$, $\xi < \alpha$, where

$$P_\xi := \{(\mu, \nu) \mid \max(\mu, \nu) = \xi\} \qquad (\xi < \alpha).$$

Note that $P_\xi = (W(\xi) \times \{\xi\}) \cup (\{\xi\} \times W(\xi)) \cup \{(\xi, \xi)\}$ for each $\xi < \alpha$, and

$$\bigsqcup_{\xi<\alpha} P_\xi = W(\alpha) \times W(\alpha).$$

Thus the sets $P_\xi$ ($\xi < \alpha$) are pairwise disjoint with union $W(\alpha) \times W(\alpha)$.
Now each set $P_\xi$ gets well-ordered by the order it inherits from the lexicographic
order on $W(\alpha) \times W(\alpha)$, under which its order type will be $\xi 2 + 1$.


Finally, we define a new order on $W(\alpha) \times W(\alpha)$ by setting $P_\xi < P_\eta$ whenever $\xi < \eta$,
and following the inherited lexicographic order within each set $P_\xi$. Put differently,
this order is defined as follows.


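Definition 676 (Canonical Order $\lhd$ on Pairs of Ordinals). The canonical order,
denoted by $\lhd$, for pairs of ordinals is defined as:

$$(\mu, \nu) \lhd (\gamma, \delta) \iff \begin{cases} \max(\mu, \nu) < \max(\gamma, \delta), \\ \text{or } \max(\mu, \nu) = \max(\gamma, \delta) \text{ and } \mu < \gamma, \\ \text{or } \max(\mu, \nu) = \max(\gamma, \delta) \text{ and } \mu = \gamma \text{ and } \nu < \delta. \end{cases}$$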


Since each set $P_\xi$ has order type $\xi 2 + 1$, we immediately get:


Corollary 677. Under the canonical order $\lhd$, $W(\alpha) \times W(\alpha)$ has order type

$$\operatorname{OrdTyp}_{\lhd}(W(\alpha) \times W(\alpha)) = \sum_{\xi<\alpha} (\xi 2 + 1),$$

and hence the canonical order is a well-order.
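
On finite ordinals the canonical order is easy to experiment with. The sketch below is an illustration only: the helper names `canonical_key` and `canonical_order` are hypothetical, and the ordinals are restricted to natural numbers so that $W(n) = \{0, \dots, n-1\}$. Sorting by the triple (max, first coordinate, second coordinate) reproduces the three clauses of Definition 676.

```python
def canonical_key(pair):
    """Sort key realising the canonical order: compare the maxima first,
    then the first coordinates, then the second coordinates."""
    mu, nu = pair
    return (max(mu, nu), mu, nu)

def canonical_order(n):
    """All pairs of W(n) x W(n), listed in increasing canonical order."""
    pairs = [(mu, nu) for mu in range(n) for nu in range(n)]
    return sorted(pairs, key=canonical_key)

order4 = canonical_order(4)
# Sizes of the blocks P_0, P_1, P_2, P_3; for finite xi, xi*2 + 1 is 2*xi + 1.
block_sizes = [sum(1 for p in order4 if max(p) == xi) for xi in range(4)]
```

For $n = 4$ the block sizes are the odd numbers 1, 3, 5, 7, whose sum is $4^2$, matching the formula of Corollary 677 in the finite case. The list for $n = 3$ is a prefix of the list for $n = 4$, which is the initial segment property of the canonical order for finite ordinals.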


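The canonical order on $W(\alpha) \times W(\alpha)$ has nicer properties than the lexicographic
order on it.


Problem 678. 1. If $\beta < \alpha$, then $W(\beta) \times W(\beta)$ is an initial segment of $W(\alpha) \times W(\alpha)$
under the canonical order. In fact, we have

$$W(\beta) \times W(\beta) = \operatorname{Pred}_{\lhd}((0, \beta)).$$

2. Let $\Phi(\alpha) := \operatorname{OrdTyp}_{\lhd}(W(\alpha) \times W(\alpha)) = \sum_{\beta<\alpha} (\beta 2 + 1)$. Then $\Phi$ is a normal
function.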


Problem 679. Show that both the properties in Problem 678 fail if the canonical
order is replaced by the lexicographic order on $W(\alpha) \times W(\alpha)$. In fact:


1. If $1 < \beta < \alpha$ then $W(\beta) \times W(\beta)$ is not an initial segment of $W(\alpha) \times W(\alpha)$ under
the lexicographic order.

2. If $\Psi(\alpha) := \alpha^2$ = the order type of $W(\alpha) \times W(\alpha)$ under the lexicographic order,
then $\Psi$ is not a normal function.


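Theorem 680. Let $\alpha > 2$ be a product-closed ordinal. Then the canonical order
on $W(\alpha) \times W(\alpha)$ has order type $\alpha$, i.e., $\operatorname{OrdTyp}_{\lhd}(W(\alpha) \times W(\alpha)) = \alpha$. Hence there
is a unique bijection $\psi\colon W(\alpha) \times W(\alpha) \to W(\alpha)$ which preserves order:

$$(\mu, \nu) \lhd (\xi, \eta) \iff \psi((\mu, \nu)) < \psi((\xi, \eta)).$$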


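Proof. Let $\alpha > 2$ be product-closed, and let $\rho := \operatorname{OrdTyp}_{\lhd}(W(\alpha) \times W(\alpha))$ be the
order type of $W(\alpha) \times W(\alpha)$ under the canonical order. The mapping $\xi \mapsto (\xi, \xi)$ is a
strictly increasing injection (order embedding) of $W(\alpha)$ into $W(\alpha) \times W(\alpha)$, hence
$\alpha \le \rho$.

Now let $\beta < \rho$ be given. Then there is $(\mu, \nu) \in W(\alpha) \times W(\alpha)$ such that
$\operatorname{Pred}_{\lhd}((\mu, \nu))$ has order type $\beta$. Since $\alpha$ is a limit ordinal we can fix $\delta$ with
$\max(\mu, \nu) < \delta < \alpha$. Then $\operatorname{Pred}_{\lhd}((\mu, \nu)) \subseteq W(\delta) \times W(\delta)$, so

$$\beta \le \operatorname{OrdTyp}_{\lhd}(W(\delta) \times W(\delta)) = \sum_{\gamma<\delta} (\gamma 2 + 1) \le (\delta 2 + 1)\delta \le (\delta 3)\delta < \alpha,$$

since $\alpha$ is product-closed. Thus $\beta < \alpha$ for all $\beta < \rho$, i.e., $\rho \le \alpha$. So $\alpha = \rho$. □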


9.9 The Cantor Normal Form 


We generalize the following familiar fact about natural numbers to ordinals: Given
a base $b > 1$, every natural number $n$ can be uniquely expressed as

$$n = d_1 b^{p_1} + d_2 b^{p_2} + \cdots + d_k b^{p_k}, \quad k \in \mathbf{N}, \qquad p_1 > p_2 > \cdots > p_k \ge 0, \quad 0 < d_1, d_2, \dots, d_k < b.$$


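Theorem 681 (Expansion in Powers of a Base). Let $\beta > 1$ be a fixed “base”
ordinal. Then every ordinal $\alpha > 0$ can be expressed as the following “polynomial”
in powers of $\beta$ with nonzero coefficients less than $\beta$:

$$\alpha = \beta^{\zeta_1}\eta_1 + \beta^{\zeta_2}\eta_2 + \cdots + \beta^{\zeta_k}\eta_k, \quad k \in \mathbf{N}, \qquad \zeta_1 > \zeta_2 > \cdots > \zeta_k, \quad 0 < \eta_1, \eta_2, \dots, \eta_k < \beta.$$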


Proof. Let $\nu$ be the least ordinal such that $\beta^\nu > \alpha$ (such a $\nu$ must exist since $\beta^{\alpha+1} \ge
\alpha + 1 > \alpha$). Then $\nu$ must be a successor ordinal, since otherwise from $\beta^\gamma \le \alpha$ for all
$\gamma < \nu$ we would get $\beta^\nu \le \alpha$ (by continuity in the exponent), contradicting $\beta^\nu > \alpha$.
Hence $\nu = \zeta_1 + 1$ for some $\zeta_1$ satisfying $\beta^{\zeta_1} \le \alpha$, with $\zeta_1$ being the largest ordinal
for which $\beta^{\zeta_1} \le \alpha$. By the division algorithm,

$$\alpha = \beta^{\zeta_1}\eta_1 + \xi_1, \qquad \xi_1 < \beta^{\zeta_1}.$$

Then we get $0 < \eta_1 < \beta$, as $\beta \le \eta_1$ would imply $\beta^{\zeta_1+1} = \beta^{\zeta_1}\beta \le \beta^{\zeta_1}\eta_1 \le \alpha$,
contradicting the fact that $\zeta_1$ is the largest ordinal for which $\beta^{\zeta_1} \le \alpha$.

If now $\xi_1 = 0$, we are done. Otherwise $\xi_1 > 0$ and we repeat the above procedure
with $\alpha$ replaced by $\xi_1$ to get

$$\xi_1 = \beta^{\zeta_2}\eta_2 + \xi_2, \qquad \xi_2 < \beta^{\zeta_2}.$$

Here again we must have $0 < \eta_2 < \beta$, and also $\zeta_2 < \zeta_1$. We thus have:

$$\alpha = \beta^{\zeta_1}\eta_1 + \beta^{\zeta_2}\eta_2 + \xi_2, \qquad \zeta_1 > \zeta_2, \quad 0 < \eta_1, \eta_2 < \beta.$$

Continuing in this fashion we see that the process must stop after a finite number
of steps since otherwise we would get an infinite strictly decreasing sequence of
ordinals $\xi_1 > \xi_2 > \cdots$. □


Note that when $\alpha$ and $\beta$ are finite ordinals, the above theorem gives the usual
representation of ordinary integers with respect to a base, e.g., with $\beta = 10$ we
have decimal representation and with $\beta = 2$ we have binary representation.
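
For finite ordinals the proof of Theorem 681 is an algorithm that can be run directly. The following sketch (the function name `base_expansion` is a hypothetical choice) follows the proof step by step for natural numbers: find the least exponent $\nu$ with $\beta^\nu$ exceeding the current remainder, set $\zeta = \nu - 1$, apply the division algorithm, and repeat with the remainder.

```python
def base_expansion(alpha, beta):
    """Return [(zeta_1, eta_1), ..., (zeta_k, eta_k)] such that
    alpha = sum(beta**zeta * eta), zeta_1 > ... > zeta_k, 0 < eta_i < beta.
    Only natural numbers are handled (finite ordinals)."""
    if alpha <= 0 or beta <= 1:
        raise ValueError("need alpha > 0 and beta > 1")
    terms = []
    xi = alpha
    while xi > 0:
        # nu is the least exponent with beta**nu > xi, so zeta = nu - 1
        # is the largest exponent with beta**zeta <= xi.
        nu = 0
        while beta**nu <= xi:
            nu += 1
        zeta = nu - 1
        eta, xi = divmod(xi, beta**zeta)  # division algorithm: quotient, remainder
        terms.append((zeta, eta))
    return terms
```

For example, `base_expansion(2014, 10)` returns the exponent and digit pairs of the decimal numeral 2014, skipping the zero digit because only nonzero coefficients are recorded.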

The case where the base $\beta$ equals $\omega$ is particularly important:


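Corollary 682 (Cantor Normal Form). Every ordinal $\alpha > 0$ can be expressed as
a “polynomial in $\omega$ with integral coefficients”:

$$\alpha = \omega^{\alpha_1} n_1 + \omega^{\alpha_2} n_2 + \cdots + \omega^{\alpha_k} n_k, \quad k \in \mathbf{N}, \qquad \alpha_1 > \alpha_2 > \cdots > \alpha_k, \quad 0 < n_1, n_2, \dots, n_k < \omega.$$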


Problem 683. Every ordinal $\alpha < \omega^\omega$ can be uniquely expressed as the “polynomial
in $\omega$ with integral exponents and coefficients”:

$$\alpha = \omega^{m_1} n_1 + \omega^{m_2} n_2 + \cdots + \omega^{m_k} n_k, \qquad m_1 > m_2 > \cdots > m_k,$$

with the exponents $m_j$ and coefficients $n_j > 0$ all finite for $j = 1, 2, \dots, k$.
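
The representation in Problem 683 lends itself to computation. The sketch below is an illustration under stated assumptions: an ordinal below $\omega^\omega$ is stored as a list of pairs `(m, n)` with strictly decreasing finite exponents `m` and positive finite coefficients `n` (the empty list stands for 0), and the names `add`, `ONE`, and `OMEGA` are hypothetical. Addition uses the standard absorption rule: in $\alpha + \beta$, every term of $\alpha$ whose exponent is below the leading exponent of $\beta$ disappears, and terms with equal exponent have their coefficients added.

```python
def add(alpha, beta):
    """Ordinal sum alpha + beta for ordinals below omega**omega, each given
    as a list of (exponent, coefficient) pairs with decreasing exponents."""
    if not beta:
        return list(alpha)
    lead_exp, lead_coeff = beta[0]
    # Terms of alpha with exponent above beta's leading exponent survive.
    kept = [(m, n) for (m, n) in alpha if m > lead_exp]
    # A term of alpha with the same exponent merges into beta's leading term.
    same = [n for (m, n) in alpha if m == lead_exp]
    merged = (lead_exp, lead_coeff + (same[0] if same else 0))
    return kept + [merged] + list(beta[1:])

ONE = [(0, 1)]    # the ordinal 1
OMEGA = [(1, 1)]  # the ordinal omega
```

With this representation `add(ONE, OMEGA)` is `OMEGA` while `add(OMEGA, ONE)` is the strictly larger ordinal $\omega + 1$, so the sum is not commutative. Python's built-in lexicographic comparison of such lists agrees with the ordering of the ordinals they represent, since the leading term decides first.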


Remark. The entire basic theory of ordinals and well-orders as presented in this 
chapter was created by Cantor [6]. 


Chapter 10 
Alephs, Cofinality, and the Axiom of Choice 


Abstract This chapter concludes our development of cardinals and ordinals.
We introduce the first uncountable ordinal, the alephs and their arithmetic,
Hartogs’ construction, Zermelo’s well-ordering theorem, the comparability theorem
for cardinals, cofinality and regular, singular, and inaccessible cardinals, and the
Continuum Hypothesis.


10.1 Countable Ordinals, $\omega_1$, and $\aleph_1$


All orders we have encountered so far, including all uncountable orders such as R 
and all ordinals, have been of countable cofinality. (Recall that an order is said to 
have countable cofinality if it has a countable cofinal subset.) We will now define an 
ordinal of uncountable cofinality. 

Recall that an ordinal is a countable ordinal if it is the order type of a well-order 
defined on a countable set, or equivalently if it is the order type of a well-ordered 
rearrangement of a subset of N. By Cantor’s theorem on countable dense orders, 
every countable order is isomorphic to a suborder of Q, hence a countable ordinal 
could also be defined as an order type of a well-ordered suborder of Q. We thus 
have: 


Proposition 684 (Countable Ordinals). For any ordinal $\alpha$, the following are
equivalent.


1. $\alpha$ is a countable ordinal.

2. $\alpha$ is the order type of a well-ordered rearrangement of a subset of $\mathbf{N}$.
3. $\alpha$ is the order type of a well-ordered suborder of $\mathbf{Q}$.

4. $W(\alpha)$ is countable.
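
As a concrete instance of condition 3, the ordinal $\omega^2$ can be realized as a well-ordered suborder of $\mathbf{Q}$. The sketch below is only an illustration: the map `embed` is one hypothetical choice among many. It sends $\omega m + n$ to the rational $m + n/(n+1)$ and checks on a finite sample that the ordinal order (the lexicographic order on the pairs $(m, n)$) is preserved.

```python
from fractions import Fraction

def embed(m, n):
    """Send the ordinal omega*m + n (m, n finite) to the rational m + n/(n+1).
    Since n/(n+1) lies in [0, 1) and increases with n, the map is order preserving."""
    return Fraction(m) + Fraction(n, n + 1)

# Pairs listed lexicographically, i.e. in the order of the ordinals omega*m + n.
pairs = [(m, n) for m in range(5) for n in range(5)]
images = [embed(m, n) for (m, n) in pairs]
```

The image of the whole of $W(\omega^2)$ under this map is a set of rationals of order type $\omega^2$.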


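All ordinals that we have seen so far are countable ordinals, such as:

$$0, 1, 2, \dots, \omega, \omega + 1, \dots, \omega 2, \dots, \omega^2, \dots, \omega^\omega, \dots, \varepsilon_0, \dots, \varepsilon_1, \dots$$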

As we will see now, the set of all countable ordinals turns out to be an uncountable
well-order. In other words, the set of ordinals of well-ordered rearrangements
of subsets of $\mathbf{N}$ is itself a well-order which is longer than any well-ordered
rearrangement of a subset of $\mathbf{N}$.

Let $C$ be the set of all countable ordinals. Then $C$ is an initial set of ordinals
(Definition 638), so we have $C = W(\omega_1)$, where $\omega_1$ is the order type of $C$. Hence
$\alpha$ is a countable ordinal if and only if $\alpha < \omega_1$.


Definition 685 (Cantor). $\omega_1$ denotes the order type of the set of all countable
ordinals, so that $\alpha \in W(\omega_1) \iff W(\alpha)$ is countable. Equivalently, $\omega_1$ is the order
type of the set of all ordinals of well-ordered suborders of $\mathbf{Q}$.


Since $\omega_1 \notin W(\omega_1)$ and $W(\omega_1)$ contains all countable ordinals, it follows that $\omega_1$
is not a countable ordinal, while every $\alpha < \omega_1$ is countable. Hence $\omega_1$ is the least
uncountable ordinal. Note also that $\omega_1$ must be a limit ordinal, since the successor
of a countable ordinal is a countable ordinal. $W(\omega_1)$ is thus an uncountable well-order
without a greatest element, in which every proper initial segment is countable
and every nonempty terminal segment is uncountable. These facts are summarized
in the following:


Theorem 686. $\omega_1$ is the smallest uncountable ordinal and is a limit ordinal. The
set $W(\omega_1)$ consists of all countable ordinals, and is an uncountable well-order in
which every proper initial segment is countable.


Assuming the countable axiom of choice, the uncountable well-order $W(\omega_1)$ also
has uncountable cofinality, since limits of countable sequences of countable ordinals
are countable ordinals. In other words, if we had a countable cofinal subset $E$ of
$W(\omega_1)$, then $W(\omega_1) = \bigcup_{\alpha \in E} W(\alpha)$ would be countable (being a countable union of
countable sets, a fact which uses the countable axiom of choice), contradicting the
uncountability of $W(\omega_1)$. This means the limit ordinal $\omega_1$ cannot be expressed as a
sequential limit of smaller ordinals:

$$\text{If } \alpha_n < \omega_1 \text{ for all } n < \omega, \text{ then } \lim_{n<\omega} \alpha_n < \omega_1.$$

Hence every cofinal (unbounded) subset of $W(\omega_1)$ is uncountable and so has order
type $\omega_1$. Conversely, any uncountable subset of $W(\omega_1)$ is cofinal. Thus:


Theorem 687 (CAC). The well-order $W(\omega_1)$ has uncountable cofinality. A subset
of $W(\omega_1)$ is cofinal (unbounded) if and only if it is uncountable if and only if it has
order type $\omega_1$.


We say that a set $E$ of ordinals is closed under internal repeated additions if
whenever $\gamma \in E$ and $\beta_\alpha \in E$ for each $\alpha < \gamma$, then $\sum_{\alpha<\gamma} \beta_\alpha \in E$. We say that
$E$ is closed under internal sups if whenever $\gamma \in E$ and $\beta_\alpha \in E$ for each $\alpha < \gamma$,
then $\sup_{\alpha<\gamma} \beta_\alpha \in E$.

Problem 688. Show that the set $W(\omega) = \{0, 1, 2, \dots\} = \{n \mid n < \omega\}$ of all
finite ordinals is closed under addition, multiplication, exponentiation, and internal
repeated additions and internal sups.


Problem 689 (CAC). 


1. Show that $W(\omega^2) = \{\omega m + n \mid m, n < \omega\}$.

2. Show that the set $W(\omega^2) = \{\omega m + n \mid m, n < \omega\}$ is the smallest set of ordinals
containing 0, 1, and $\omega$, and closed under addition.

3. Find the smallest set of ordinals containing 0, 1, and $\omega$, and closed under both
addition and multiplication.

4. Find the smallest set of ordinals containing 0, 1, and $\omega$, and closed under
addition, multiplication, and ordinal exponentiation.

5. Show that if a set of ordinals containing 0, 1, and $\omega$ is closed under addition and
under taking internal sups, then it is closed under internal repeated additions,
under multiplication, and under exponentiation. Find the smallest set of ordinals
containing 0, 1, and $\omega$, and closed under addition and taking internal sups.

6. Show that if a set of ordinals containing 0, 1, and $\omega$ is closed under internal
repeated additions, then it is closed under addition, multiplication, and exponentiation.
Find the smallest set of ordinals containing 0, 1, and $\omega$, and closed under
internal repeated additions.

7. Show that a set of ordinals which contains 0 and is closed under taking
successors and countable limits must contain all countable ordinals.


Problem 690. Suppose that $\alpha_\nu > 0$ is a nonzero countable ordinal for each $\nu < \omega_1$.
Show that $\sum_{\nu<\omega_1} \alpha_\nu = \omega_1$. Are you using the CAC?


Earlier we saw that orders of type $(1 + \lambda + 1)^k \lambda$ ($k \in \mathbf{N}$) are examples of non-CCC
continuums. We now have a different continuum which is not CCC.


Problem 691 (The Long Line). Let $X$ be an order of type $\lambda + (1 + \lambda)\omega_1$, and let
$Y$ be an order of type $(1 + \lambda + 1)\lambda$. Show that:


1. Both $X$ and $Y$ are non-CCC linear continuums without endpoints.

2. In $X$ every nonempty bounded open interval has order type $\lambda$ (and so is
isomorphic to $\mathbf{R}$), but this is not true in the order $Y$.

3. $X$ has no countable cofinal subset.

4. None of the continuums $X$ and $Y$ can be embedded in the other.


10.2 The Cardinal $\aleph_1$


The set $W(\omega_1)$ of all countable ordinals has cardinality $> \aleph_0$, and this cardinal
number is denoted by $\aleph_1$.


Definition 692 (Cantor). $\aleph_1 := |W(\omega_1)|$.

Problem 693. A subset of $W(\omega_1)$ is either countable or has cardinality $\aleph_1$.

So the cardinal $\aleph_1$ is the next larger cardinal after $\aleph_0$ in the following sense:

Problem 694. $\aleph_0 < \aleph_1$ and there is no cardinal $\kappa$ such that $\aleph_0 < \kappa < \aleph_1$.

Problem 695. Prove that


1. $n + \aleph_1 = \aleph_1$ for any finite cardinal $n$.

2. $\aleph_0 + \aleph_1 = \aleph_1$.

3. $\aleph_1 + \aleph_1 = \aleph_1$. [Hint: Use even and odd ordinals.]
4. $\aleph_0 \aleph_1 = \aleph_1$.

Theorem 696. $\aleph_1^2 = \aleph_1$.
Proof. The ordinal $\omega_1$ is product-closed, i.e., $\alpha, \beta < \omega_1 \Rightarrow \alpha\beta < \omega_1$ (since if
$W(\alpha)$, $W(\beta)$ are countable then so is $W(\alpha) \times W(\beta)$). Hence by Theorem 680, the
canonical order $\lhd$ on $W(\omega_1) \times W(\omega_1)$ has order type $\omega_1$, and so there is a bijection
(order isomorphism) from $W(\omega_1) \times W(\omega_1)$ onto $W(\omega_1)$. □


Note that the above is an effective proof of $\aleph_1^2 = \aleph_1$ without any use of the Axiom
of Choice. A variant proof is obtained using the following problem.


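Problem 697. Show that $f\colon W(\omega_1) \times W(\omega_1) \to W(\omega_1)$ defined by

$$f(\alpha, \beta) = \begin{cases} 2\alpha^2 & \text{if } \beta = \alpha, \\ 2(\alpha^2 + \beta) + 1 & \text{if } \beta < \alpha, \\ 2(\beta^2 + \alpha) + 2 & \text{if } \beta > \alpha, \end{cases}$$

is an injection from $W(\omega_1) \times W(\omega_1)$ into $W(\omega_1)$.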
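
The injectivity claim of Problem 697 can be sanity-checked on finite ordinals, where ordinal arithmetic agrees with ordinary arithmetic. The sketch below assumes the three cases read $2\alpha^2$ (for $\beta = \alpha$), $2(\alpha^2 + \beta) + 1$ (for $\beta < \alpha$), and $2(\beta^2 + \alpha) + 2$ (for $\beta > \alpha$); the bound `N = 50` is an arbitrary illustrative choice, and a finite check is of course no substitute for the proof asked for in the problem.

```python
def f(alpha, beta):
    """The pairing map of Problem 697, restricted to natural numbers."""
    if beta == alpha:
        return 2 * alpha**2
    if beta < alpha:
        return 2 * (alpha**2 + beta) + 1
    return 2 * (beta**2 + alpha) + 2

N = 50  # hypothetical bound for the finite check
values = [f(a, b) for a in range(N) for b in range(N)]
is_injective = len(set(values)) == len(values)
```

The second case produces odd values and the other two produce even values. The first and third cases do not collide because $\beta^2 + \alpha + 1$ lies strictly between $\beta^2$ and $(\beta + 1)^2$ when $\alpha < \beta$, so it is never a square.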


One can then use the Cantor-Bernstein theorem to combine the above mapping $f$
with the injection $\alpha \mapsto (\alpha, 0)$ from $W(\omega_1)$ into $W(\omega_1) \times W(\omega_1)$ to obtain an
effective bijection between $W(\omega_1) \times W(\omega_1)$ and $W(\omega_1)$.


Using the above results, we get many more well-orders of cardinality $\aleph_1$. For
example, orders of type $\omega_1 + \omega$, or $\omega_1 2$, or $\omega_1^2$, all have cardinality $\aleph_1$. Note that
many of these orders (e.g., any order of type $\omega_1 + \omega$) are uncountable but have
countable cofinal subsets.


Closed Unbounded Subsets of $W(\omega_1)$


Recall that a subset $A$ of an order is called closed if $A$ contains all its limit points
(Definition 592). In a well-order there are no lower limit points and so $A$ is closed if
$A$ contains all its upper limit points, or equivalently if $\sup E \in A$ for all nonempty
bounded $E \subseteq A$.

Definition 698 (Club Sets). A subset of $W(\omega_1)$ is called a club set if it is both closed
and unbounded above.


Proposition 699. The intersection of countably many club sets is a club set. 


Proof. Let $A_1, A_2, \dots$ be club sets and let $A = \bigcap_n A_n$. It is easy to see that $A$ is
closed, since if $x$ is any (upper) limit point of $A$ then $x$ is an upper limit point of $A_n$
for all $n$, and so (since $A_n$ is closed) $x \in A_n$ for all $n$. To see that $A$ is unbounded,
fix any countable ordinal $\mu$. Fix also a function $g\colon \mathbf{N} \to \mathbf{N}$ such that for any $n \in \mathbf{N}$
there are infinitely many $m$ with $g(m) = n$, e.g., $g$ may be taken to be the sequence

$$1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, \dots$$

Since each $A_n$ is unbounded, for any finite set $F$ of countable ordinals there exists
$\alpha \in A_n$ such that $\alpha > \xi$ for all $\xi \in F$. Using this we can inductively choose ordinals
$\alpha_n$, for each $n$, such that each $\alpha_n \in A_{g(n)}$ and is greater than all preceding elements:

$$\mu < \alpha_1 < \alpha_2 < \alpha_3 < \cdots \quad\text{with } \alpha_n \in A_{g(n)} \text{ for all } n.$$

Let $\alpha := \sup_n \alpha_n$. Then $\alpha$ is a limit point of $A_n$ for all $n$, hence $\alpha \in A_n$ for all $n$,
and so $\alpha \in \bigcap_n A_n = A$. □
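
The bookkeeping function $g$ used in the proof is easy to generate explicitly. The following sketch (the generator name `g_sequence` is a hypothetical choice) produces the sequence 1, 1, 2, 1, 2, 3, ... by concatenating the blocks $1, \dots, k$ for $k = 1, 2, 3, \dots$, so that every $n$ occurs once in each block with $k \ge n$ and hence infinitely often.

```python
from itertools import count, islice

def g_sequence():
    """Yield 1, 1, 2, 1, 2, 3, 1, 2, 3, 4, ...: the blocks 1..k for k = 1, 2, 3, ..."""
    for k in count(1):
        yield from range(1, k + 1)

first_15 = list(islice(g_sequence(), 15))
first_55 = list(islice(g_sequence(), 55))  # exactly the blocks k = 1, ..., 10
```

Any other enumeration that visits each index infinitely often would serve equally well in the proof.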


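Problem 700. Prove that if $f\colon W(\omega_1) \to \mathbf{R}$ is continuous, then $f$ must be
eventually constant, i.e., there exists $\alpha < \omega_1$ such that $f(\beta) = f(\alpha)$ for all $\beta > \alpha$.


[Hint: Put $E[x] := \{\alpha \mid f(\alpha) \ge x\}$ and $F[x] := \{\alpha \mid f(\alpha) \le x\}$, which
are closed subsets of $W(\omega_1)$ for $x \in \mathbf{R}$ (Problem 597). Put $L := \{x \in \mathbf{R} \mid
E[x] \text{ is uncountable}\}$, which is nonempty and bounded above since $\bigcup_{n \in \mathbf{Z}} F[n] =
W(\omega_1)$ while $\bigcap_{n \in \mathbf{Z}} E[n] = \emptyset$. Finally, put $p := \sup L$, and show that both $E[p + \frac{1}{n}]$
and $F[p - \frac{1}{n}]$ must be countable for all $n \in \mathbf{N}$.]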

The process by which we constructed $\omega_1$ and $\aleph_1$ from $\omega$ and $\aleph_0$ can be iterated
further as follows. Consider the set of order types of well-orders defined on subsets
of $W(\omega_1)$. This is the set of ordinals of well-orders defined on sets of cardinality
$\le \aleph_1$, and is an initial set of ordinals (Definition 638). Hence it equals $W(\omega_2)$ for an
ordinal $\omega_2$. Then $\omega_2$ cannot be the ordinal of a well-order of cardinality $\le \aleph_1$ (since
$\omega_2 \notin W(\omega_2)$). Thus $W(\omega_2)$ has cardinality $> \aleph_1$, and $\omega_2$ is the type of a well-order
on a set of cardinality $> \aleph_1$ (and is the least such ordinal). We denote the cardinality
of the set $W(\omega_2)$ by $\aleph_2$, which gives us a cardinal greater than $\aleph_1$. We can continue
this process to get $\omega_3$ and $\aleph_3$, and so on.

The cardinalities of the sets $W(\alpha)$, for various ordinals $\alpha$, start as:

$$|W(0)| = 0 < |W(1)| = 1 < \cdots < |W(\omega)| = \aleph_0.$$

After this, we have $|W(\alpha)| = \aleph_0$ for all uncountably many $\alpha$ satisfying $\omega \le \alpha < \omega_1$,
since the ordinals in $W(\omega_1) \smallsetminus W(\omega)$ are the types of infinite countable well-orders
(this is called the second number class by Cantor). The next ordinal $\alpha$ for which we
have $|W(\alpha)| > |W(\beta)|$ for all $\beta < \alpha$ is $\alpha = \omega_1$.


Definition 701 (Initial Ordinals). An ordinal $\alpha$ is an initial ordinal if the cardinality
of $W(\alpha)$ is greater than that of $W(\beta)$ for all $\beta < \alpha$.


Thus all finite ordinals and the ordinal $\omega$ are initial ordinals. The next initial ordinal
is $\omega_1$, the following one is $\omega_2$, etc. Note also that if $E$ is a set of initial ordinals, then
$\sup E$ is also an initial ordinal. To generalize, we first define:


Definition 702. If $\alpha$ is an ordinal and $A$ is a set, we write $\alpha \preccurlyeq A$ to mean that $\alpha$ is
the ordinal of a well-order defined on some subset of $A$. In other words, $\alpha \preccurlyeq A$ if
there is an injection from $W(\alpha)$ into $A$.


For example, $\alpha \preccurlyeq \mathbf{N}$ if and only if $\alpha$ is a countable ordinal.


Definition 703 (The Hartogs Set $H(A)$, Ordinal $\omega(A)$, and Cardinal $\aleph(A)$ of a
Set). Let $A$ be any set. We define $H(A)$, the Hartogs set of $A$, to be the set of order
types of well-orders defined on subsets of $A$, so that $H(A) = \{\alpha \mid \alpha \preccurlyeq A\}$. The
order type of $H(A)$, denoted by $\omega(A)$, is called the Hartogs ordinal of $A$, and the
cardinality of $H(A)$, denoted by $\aleph(A)$, is called the Hartogs cardinal of $A$.


Theorem 704 (Hartogs’ Theorem). For any set $A$:


1. $H(A)$ is an initial set of ordinals, with $H(A) = W(\eta) = \{\alpha \mid \alpha < \eta\}$, where
$\eta = \omega(A)$ is the Hartogs ordinal of $A$.

2. $H(A)$ is not equinumerous to any subset of $A$, and so $\aleph(A) \not\le |A|$.

3. $\omega(A)$ is an initial ordinal with $\omega(A) \not\preccurlyeq A$, and so if $A$ is well-ordered with ordinal
$\alpha$ then $\omega(A) > \alpha$ and is in fact the least initial ordinal $> \alpha$.

4. If $A$ can be well-ordered then $\aleph(A)$ is the next larger cardinal after $|A|$, that is
$\aleph(A) > |A|$ and there is no cardinal $\kappa$ such that $|A| < \kappa < \aleph(A)$.


Problem 705. Prove Theorem 704. 


If $|A| = |B|$ then clearly $H(A) = H(B)$. Thus $H(A)$, $\omega(A)$, and $\aleph(A)$ depend only
on the cardinality of $A$. Hence the following definition makes sense.


Definition 706 ($\kappa^+$ and $\omega^+(\alpha)$). For a cardinal $\kappa$ and an ordinal $\alpha$, define


1. $\kappa^+ := \aleph(A) = |H(A)|$, where $A$ is a set of cardinality $\kappa$.
2. $\omega^+(\alpha) := \omega(W(\alpha))$ = the Hartogs ordinal of $\{\beta \mid \beta < \alpha\}$.


Thus for any cardinal $\kappa$, we have $\kappa^+ \not\le \kappa$, and if $\kappa$ is the cardinality of a set which
can be well-ordered then $\kappa^+$ is the least cardinal greater than $\kappa$. If $\alpha$ is an ordinal
then $\omega^+(\alpha)$ is the least initial ordinal $> \alpha$.
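
Definition 707 ($\omega_\alpha$ and $\aleph_\alpha$). For every ordinal $\alpha$ we define the ordinal $\omega_\alpha$ by
transfinite recursion as:

$$\omega_0 := \omega, \qquad \omega_{\alpha+1} := \omega^+(\omega_\alpha) = \text{least initial ordinal} > \omega_\alpha,$$

$$\omega_\xi := \sup_{\gamma<\xi} \omega_\gamma \quad\text{if $\xi$ is a limit ordinal.}$$

Finally, define $\aleph_\alpha := |W(\omega_\alpha)| = |\{\nu \mid \nu < \omega_\alpha\}|$.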


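Theorem 708. 1. Every $\omega_\alpha$ is an initial ordinal with $|W(\omega_\alpha)| = \aleph_\alpha$.

2. $\alpha < \beta \Rightarrow \omega_\alpha < \omega_\beta$ and so $\alpha < \beta \Rightarrow \aleph_\alpha < \aleph_\beta$.

3. $\aleph_{\alpha+1} = \aleph_\alpha^+$, and if $\nu$ is a limit ordinal then $\aleph_\nu = \sup_{\alpha<\nu} \aleph_\alpha$.

4. Every infinite initial ordinal equals $\omega_\alpha$ for some ordinal $\alpha$.

5. If an infinite set $A$ can be well-ordered, then $|A| = \aleph_\alpha$ for an ordinal $\alpha$.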


Problem 709. Prove Theorem 708. 


We thus have the series of all initial ordinals as:

$$0 < 1 < 2 < \cdots < \omega < \omega_1 < \omega_2 < \cdots < \omega_\alpha < \cdots,$$

where, after the finite ordinals, $\omega = \omega_0$ is the only countably infinite initial ordinal
and $\omega_\alpha$ is the $\alpha$-th uncountable initial ordinal for $\alpha > 0$.

The infinite cardinals in the series

$$\aleph_0 < \aleph_1 < \aleph_2 < \cdots < \aleph_\alpha < \cdots$$

are called alephs. By the last part of Theorem 708, the above series of alephs gives a
well-ordered enumeration, indexed by the ordinals, of all infinite cardinals of
well-orderable sets.

We will soon see that every set can be well-ordered using the Axiom of Choice
(the well-ordering theorem). Hence, under the Axiom of Choice, every infinite set
has cardinality $\aleph_\alpha$ for some ordinal $\alpha$ and every infinite cardinal is an aleph. Thus,
for $\alpha > 0$, $\aleph_\alpha$ is the $\alpha$-th uncountable cardinal (under AC).


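Earlier we had proved that $\aleph_1^2 = \aleph_1$. Essentially the same proof yields:

$$\aleph_\alpha^2 = \aleph_\alpha, \quad\text{and so under AC:}\quad \kappa^2 = \kappa$$

for any infinite cardinal $\kappa$. From these relations and the Cantor-Bernstein theorem,
the sum and product of any two alephs are determined completely as follows:


Theorem 710. $\aleph_\alpha + \aleph_\beta = \aleph_\alpha \aleph_\beta = \aleph_{\max(\alpha,\beta)} = \max(\aleph_\alpha, \aleph_\beta)$.


We also have, by induction, $\aleph_\alpha^n = \aleph_\alpha$ for any nonzero finite cardinal $n$.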

On the other hand, when the exponent is infinite, it is impossible to compute
cardinal powers such as $\aleph_1^{\aleph_0}$ or $2^{\aleph_0}$ as an aleph even using AC. (Without AC, we
cannot prove that such a cardinal is an aleph.)


Problem 711. Show by examples that for arbitrary orders X: 


1. The Bolzano—Weierstrass property does not imply the strong Nested Intervals 
property. 

2. The strong Nested Intervals property does not imply the Bolzano—Weierstrass 
property. 

3. The Bolzano—Weierstrass property together with the strong Nested Intervals 
property does not imply completeness. 

4. The sequential NIP does not imply that either the strong Nested Intervals
property or the Bolzano-Weierstrass property holds in $X$.


[Hint: Consider orders of type $\omega + {}^*\omega_1$, $\omega_1 + {}^*\omega_1$, $\omega_1 + {}^*\omega_2$, etc.]


10.4 Abstract Derivatives and Ranks 


Definition 712 (Derivative Operators). Let $X$ be any set. A mapping $\nabla\colon \mathcal{P}(X) \to
\mathcal{P}(X)$ will be called a derivative operator if $\nabla(E) \subseteq E$ for all $E \subseteq X$. (A derivative
may also be referred to as a contraction or reduction.)


An example of a derivative operator in the context of orders is the following: Put
$\nabla(E) := E \cap D(E)$, where $D(E)$ is the set of limit points of $E$. In other words
$\nabla(E)$ is obtained from $E$ by removing the isolated points of $E$. This is the
Cantor-Bendixson derivative, and gives rise to Cantor-Bendixson ranks, which will be
studied in detail later. The notion of derivative as defined here is an abstract version
of the Cantor-Bendixson derivative.


Definition 713 (Strict Derivatives). A derivative operator $\nabla\colon \mathcal{P}(X) \to \mathcal{P}(X)$ will
be called strict if $\nabla(E) \subsetneq E$ whenever $E \ne \emptyset$.


One can naturally iterate a derivative operator and define by transfinite recursion
sets $X^{(\alpha)}$, where $\alpha$ is any ordinal, as follows:

$$X^{(0)} := X, \qquad X^{(\alpha+1)} := \nabla(X^{(\alpha)}), \quad\text{and}\quad X^{(\xi)} := \bigcap_{\alpha<\xi} X^{(\alpha)} \ \text{ if $\xi$ is a limit ordinal.}$$

The sets $X^{(\alpha)}$ decrease with $\alpha$ so that we have

$$X = X^{(0)} \supseteq X^{(1)} \supseteq X^{(2)} \supseteq \cdots \supseteq X^{(\alpha)} \supseteq X^{(\alpha+1)} \supseteq \cdots,$$

but the following theorem shows that the process must “stabilize” at an ordinal, i.e.,
there is $\mu$ such that $X^{(\mu+1)} = X^{(\mu)}$ (and so $X^{(\alpha)} = X^{(\mu)}$ for all $\alpha \ge \mu$). It is a
theorem which, in an abstract setting, assigns ordinal ranks to certain elements of
the set $X$ without using the Axiom of Choice. We will have many occasions to use
the general framework of this theorem.


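Theorem 714 (Rank Decomposition for Derivative Operators). Let $X$ be a set,
$\nabla\colon \mathcal{P}(X) \to \mathcal{P}(X)$ be a derivative operator, and $\eta = \omega(\mathcal{P}(X))$ be the Hartogs
ordinal for $\mathcal{P}(X)$. For each $\alpha < \eta$, define the set $X^{(\alpha)}$, called the $\alpha$-th iterated
derivative of $X$, by transfinite recursion as follows.

$$X^{(0)} := X, \qquad X^{(\alpha+1)} := \nabla(X^{(\alpha)}), \quad\text{and}\quad X^{(\xi)} := \bigcap_{\alpha<\xi} X^{(\alpha)} \ \text{ if $\xi$ is a limit ordinal.}$$

Then


1. The sets $X^{(\alpha)}$ decrease with $\alpha$, and there exists a unique least ordinal $\mu < \eta$ for
which $X^{(\mu)} = X^{(\mu+1)}$, so that $X^{(\alpha)} = X^{(\mu)}$ for all $\alpha \ge \mu$, but $X^{(\alpha+1)} \subsetneq X^{(\alpha)}$ for
$\alpha < \mu$:

$$X = X^{(0)} \supsetneq X^{(1)} \supsetneq X^{(2)} \supsetneq \cdots \supsetneq X^{(\alpha)} \supsetneq X^{(\alpha+1)} \supsetneq \cdots \supsetneq X^{(\mu)} = X^{(\mu+1)} = X^{(\infty)},$$

where $X^{(\infty)}$ denotes the “stabilized” smallest set $X^{(\mu)}$ among the $X^{(\alpha)}$.

2. The set $X \smallsetminus X^{(\infty)} = X \smallsetminus X^{(\mu)}$ is partitioned as:

$$X \smallsetminus X^{(\infty)} = \bigsqcup_{\alpha<\mu} \bigl(X^{(\alpha)} \smallsetminus X^{(\alpha+1)}\bigr), \quad\text{with } X^{(\alpha)} \smallsetminus X^{(\alpha+1)} \ne \emptyset \text{ for all } \alpha < \mu.$$

3. Consequently, if for each $x \in X \smallsetminus X^{(\infty)}$ we put $\rho(x) = \rho_\nabla(x) :=$ the least $\alpha < \mu$
such that $x \in X^{(\alpha)} \smallsetminus X^{(\alpha+1)}$, then $\rho = \rho_\nabla\colon X \smallsetminus X^{(\infty)} \to W(\mu)$ is the unique
“ordinal rank function” such that for any $x \in X \smallsetminus X^{(\infty)}$ and any ordinal $\alpha$:

$$\rho(x) = \alpha \iff x \in X^{(\alpha)} \smallsetminus X^{(\alpha+1)}.$$

If $\rho(x) = \alpha$, we say that the element $x$ has rank $\alpha$, and thus $X^{(\alpha)} \smallsetminus X^{(\alpha+1)}$ consists
precisely of the elements of rank $\alpha$. (Put $\rho(x) = \infty$ if $x \in X^{(\infty)}$.)

4. If $\nabla$ is a strict derivative then $X^{(\mu)} = X^{(\infty)} = \emptyset$, and so $\operatorname{dom}(\rho) = X$, i.e.,
$\rho\colon X \to W(\mu)$, and every element in $X$ has an ordinal rank.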


Proof. 1. The fact that the sets $X^{(\alpha)}$ decrease is immediate from their definition via
a routine transfinite induction. If we had $X^{(\alpha)} \neq X^{(\alpha+1)}$ for all $\alpha < \eta$, then
the mapping $\alpha \mapsto X^{(\alpha)}$ is easily seen to be a one-to-one mapping from $W(\eta)$
into $\mathbf{P}(X)$, which is impossible since $\eta$ is the Hartogs ordinal for $\mathbf{P}(X)$. Hence
$X^{(\alpha)} = X^{(\alpha+1)}$ for some ordinal $\alpha < \eta$, and we can therefore take $\mu$ to be the
least such ordinal.

2. This is immediate from the definition of the sets $X^{(\alpha)}$.

3. The function $\rho = \rho_\nabla$ is readily defined as the relation:

\[ \rho_\nabla = \{ (x, \alpha) \in X \times W(\mu) \mid x \in X^{(\alpha)} \setminus X^{(\alpha+1)} \}. \]

4. If $\nabla$ is strict, then $X^{(\infty)} \neq \varnothing$ would contradict $\nabla(X^{(\infty)}) = X^{(\infty)}$. □


Remarks. (1) Note that the theorem does not use the Axiom of Choice. (2) The
relation on $X \setminus X^{(\infty)}$ defined by $\rho(x) \leq \rho(y)$ is called a pre-well-ordering.
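
For a finite set the transfinite recursion of Theorem 714 reduces to an ordinary loop, which may make the rank decomposition more concrete. The following Python sketch is only an illustration under that finiteness assumption. The names `rank_decomposition` and `drop_minimal` are hypothetical and do not come from the text; the derivative used here removes the divisibility-minimal elements of a set of positive integers, which is strict on finite nonempty sets.

```python
def rank_decomposition(X, deriv):
    """Iterate a derivative operator on a finite set until it stabilizes.

    Returns (rank, kernel): rank maps each removed element x to the stage
    alpha with x in X^(alpha) but not in X^(alpha+1); kernel is the
    stabilized set (the analogue of X^(infinity)).
    """
    current = frozenset(X)
    rank = {}
    stage = 0
    while True:
        nxt = frozenset(deriv(current))
        if not nxt <= current:
            raise ValueError("deriv(E) must be a subset of E")
        if nxt == current:           # stabilized: X^(mu) = X^(mu+1)
            return rank, current
        for x in current - nxt:      # exactly the elements of rank `stage`
            rank[x] = stage
        current = nxt
        stage += 1


def drop_minimal(E):
    """Hypothetical strict derivative: keep x only if some other y in E divides x."""
    return {x for x in E if any(y != x and x % y == 0 for y in E)}


ranks, kernel = rank_decomposition(range(1, 13), drop_minimal)
print(ranks, kernel)
```

In this run the kernel is empty because the derivative is strict, so every element of the starting set receives a rank, as in part 4 of the theorem.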


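Problem 715 (Monotone Operators). A mapping $\nabla\colon \mathbf{P}(X) \to \mathbf{P}(X)$ is called
monotone if $A \subseteq B \Rightarrow \nabla(A) \subseteq \nabla(B)$. Show that if $\nabla$ is a derivative on $X$ which
is also monotone, then the set $X^{(\infty)}$ is the largest fixed point of $\nabla$, that is, we have
$\nabla(X^{(\infty)}) = X^{(\infty)}$, and if $\nabla(E) = E$ then $E \subseteq X^{(\infty)}$.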


Problem 716. Show that if a derivative $\nabla\colon \mathbf{P}(X) \to \mathbf{P}(X)$ has the property that
$|E \setminus \nabla(E)| \leq 1$ for all $E$, then the rank function $\rho = \rho_\nabla$ must be one-to-one on
$X \setminus X^{(\infty)}$. If, in addition, $\nabla$ is strict, then $\rho\colon X \to W(\mu)$ is a bijection from $X$ onto
the set $W(\mu)$ of ordinals $< \mu$.


10.5 AC, Well-Ordering Theorem, Cardinal Comparability 


Recall the two versions of Axiom of Choice that we have seen earlier: 


• Axiom of Choice, Partition Version. Every partition has a choice set. If $P$ is a
family of pairwise disjoint nonempty sets then there is a “choice set” $C$ for the
partition $P$ satisfying $|C \cap E| = 1$ for every $E \in P$.

• Axiom of Choice, Choice Function Version. Every family of nonempty sets has a
choice function. Equivalently, for any set $A$ there is a choice function $\varphi\colon \mathbf{P}^*(A) \to A$
with $\varphi(E) \in E$ for all $E \in \mathbf{P}^*(A)$, where $\mathbf{P}^*(A) := \mathbf{P}(A) \setminus \{\varnothing\}$ is the
collection of all nonempty subsets of the set $A$.

We had seen that the two forms are equivalent. In the theorem below, we use the
Choice Function version to show that the Axiom of Choice implies the well-ordering
theorem, which says that every set can be well-ordered.


Theorem 717 (Zermelo’s Well-Ordering Theorem). The Axiom of Choice 
implies the well-ordering theorem (that every set can be well-ordered). 


Proof. Let $A$ be an arbitrary set with a choice function $\varphi\colon \mathbf{P}^*(A) \to A$. The idea of
the proof is to use the choice function $\varphi$ to well-order $A$ as follows: Let

\[ a_0 = \varphi(A), \quad a_1 = \varphi(A \setminus \{a_0\}), \quad a_2 = \varphi(A \setminus \{a_0, a_1\}), \quad \text{etc.,} \]

and in general we keep putting

\[ a_\alpha = \varphi(A \setminus \{a_\beta \mid \beta < \alpha\}), \]

until we exhaust $A$. We now formalize this using the framework of abstract iterated
derivatives and ranks (Sect. 10.4, Theorem 714).

Define a strict derivative operator $\nabla\colon \mathbf{P}(A) \to \mathbf{P}(A)$ by

\[ \nabla(E) := \begin{cases} E \setminus \{\varphi(E)\} & \text{if } E \neq \varnothing, \\ E & \text{if } E = \varnothing. \end{cases} \]

Let $A^{(\alpha)}$ denote the $\alpha$-th iterated derivative of $A$, and let $\mu$ be the least ordinal with
$A^{(\mu)} = A^{(\mu+1)} = A^{(\infty)}$. Then $A^{(\infty)} = \varnothing$ since $\nabla$ is strict, and so there is a rank
function $\rho = \rho_\nabla\colon A \to W(\mu)$ such that for all $x \in A$, $\rho(x) = \alpha$ (i.e., $x$ has rank
$\alpha$) if and only if $x \in A^{(\alpha)} \setminus \nabla(A^{(\alpha)})$. Since $A^{(\alpha)} \setminus \nabla(A^{(\alpha)})$ has at most one member,
namely $\varphi(A^{(\alpha)})$, no two distinct elements can have the same rank, i.e., the rank function $\rho$
is injective (in fact a bijection from $A$ onto $W(\mu)$). So the ordering defined on $A$ by
the relation

\[ x < y \iff \rho(x) < \rho(y) \qquad (x, y \in A) \]

is a well-ordering of $A$ (of order type $\mu$). □

Since the well-ordering theorem clearly implies the Axiom of Choice, we have:

Corollary 718. The Axiom of Choice is equivalent to the well-ordering theorem.
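
The informal recipe $a_0 = \varphi(A)$, $a_1 = \varphi(A \setminus \{a_0\})$, and so on can be run literally when $A$ is finite. The Python sketch below is an illustration under that assumption only; the function `well_order` and the particular choice function `pick` are hypothetical names introduced here, and no claim is made about the infinite case, where the Axiom of Choice is what supplies the choice function.

```python
def well_order(A, choice):
    """Enumerate a finite set A as a_0, a_1, ... where
    a_k = choice(A minus {a_0, ..., a_{k-1}})."""
    remaining = set(A)
    order = []
    while remaining:
        a = choice(frozenset(remaining))
        if a not in remaining:
            raise ValueError("choice(E) must be an element of E")
        order.append(a)
        remaining.remove(a)   # pass to the derivative E minus {choice(E)}
    return order


def pick(E):
    """Hypothetical choice function: longest string, ties broken alphabetically."""
    return min(E, key=lambda s: (-len(s), s))


order = well_order({"pear", "fig", "banana", "kiwi"}, pick)
rank = {x: i for i, x in enumerate(order)}   # the rank function rho
print(order)
```

The dictionary `rank` plays the role of the injective rank function, and comparing ranks gives the induced ordering of the set.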


If A and B are any two sets then using the well-ordering theorem we can well-order 
each of them, and consequently by comparability of well-orders one of them must 
be isomorphic to an initial segment of the other; in particular, there is an injection 
from one of the sets into the other. Thus we have: 


Theorem 719 (Cardinal Comparability). The well-ordering theorem implies that
for any two sets one of them is equinumerous to a subset of the other, and therefore
that cardinal comparability holds: For any two cardinals $\kappa$ and $\mu$ either $\kappa \leq \mu$ or
$\mu \leq \kappa$, and thus exactly one of $\kappa < \mu$, $\kappa = \mu$, $\kappa > \mu$ is true.


Conversely, cardinal comparability implies the well-ordering theorem. To see this, 
let A be any set and let H(A) be the Hartogs set of all ordinals of well-orderings 
defined on subsets of A. Then by Hartogs’ theorem H(A) is not equinumerous 
with any subset of A and so by cardinal comparability A is equinumerous with 
some subset of H(A), which is well-ordered. Hence A itself can be well-ordered. 
It follows that: 


Theorem 720. The well-ordering theorem is equivalent to cardinal comparability. 

The above results can be summarized as:


Theorem 721 (Equivalents of AC). Without using the Axiom of Choice, we can 
prove that any of the following assertions implies the others: 


1. The Axiom of Choice (either the partition or the choice function version). 
2. The Well-Ordering Theorem: Any set can be well-ordered. 
3. Cardinal Comparability: If $\kappa$, $\mu$ are cardinals then either $\kappa \leq \mu$ or $\mu \leq \kappa$.


Note that by the well-ordering theorem, any set is equinumerous with $W(\alpha)$ for
some $\alpha$, and hence with $W(\beta)$ for some initial ordinal $\beta$, and so the cardinal
number of any infinite set is an aleph. Thus we have a well-ordered enumeration
of all cardinals as


\[ 0 < 1 < 2 < \cdots < \aleph_0 < \aleph_1 < \cdots < \aleph_\omega < \cdots < \aleph_\alpha < \cdots \]


Since every infinite cardinal is an aleph and the alephs are well-ordered under the 
relation < for comparing cardinals, it follows that any set of cardinals is well- 
ordered. This allows us to use phrases like “the least cardinal with such and such 
property.” Moreover, any set of cardinals has a unique cardinal as the least upper 
bound (supremum) of the given set. 

All the facts of the last paragraph assume the Axiom of Choice, under which the
infinite cardinals (as alephs) correspond naturally to the infinite initial ordinals $\omega_\alpha$
in a one-to-one fashion, allowing us to informally identify the infinite cardinals with
the infinite initial ordinals $\omega_\alpha$.


Corollary 722 (AC). Under the Axiom of Choice, every infinite cardinal is an
aleph, and therefore for any two infinite cardinals $\kappa$ and $\mu$, we have $\kappa + \mu =
\kappa\mu = \max(\kappa, \mu)$. In particular, $\kappa^2 = \kappa$ for all infinite cardinals $\kappa$.


Problem 723. AC holds if and only if $\kappa < \kappa^+$ for every infinite cardinal $\kappa$.


Problem 724 (Tarski). The Axiom of Choice follows from the assumption that
$\kappa^2 = \kappa$ for all infinite cardinals $\kappa$.

[Hint: Given any infinite set $A$, fix a well-ordered $B$ with $|B| = \aleph(A) = |A|^+$ and
$A \cap B = \varnothing$. If $\kappa^2 = \kappa$ for all $\kappa$, then $|A||B| \leq |A| + |B|$, so there is an injection
$f\colon A \times B \to A \cup B$. Now $f[\{a\} \times B] \cap B \neq \varnothing$ for all $a \in A$, so we get an injective
$g\colon A \to B$, where $g(a) :=$ the least element in $f[\{a\} \times B] \cap B$.]


10.6 Cofinality: Regular and Inaccessible Cardinals 


Proposition 725. Let $X$ be a nonempty order without a greatest element and
suppose that $|X| = \aleph_\alpha$. Then $X$ has a well-ordered cofinal subset whose order
type is $\leq \omega_\alpha$. In particular, any countable order without a greatest element has a
cofinal subset of type $\omega$.

Proof. Since $|X| = \aleph_\alpha$, we can enumerate the elements of $X$ indexed by ordinals
$< \omega_\alpha$ as

\[ X = \{ a_\nu \mid \nu < \omega_\alpha \}. \]

Define the subset $I$ of $W(\omega_\alpha)$ by

\[ I := \{ \nu \mid \nu < \omega_\alpha \text{ and } a_\nu > a_\xi \text{ in } X \text{ for all } \xi < \nu \}. \]

$I$ is nonempty since $0 \in I$, and if $\xi < \nu$ are in $I$ then $a_\xi < a_\nu$ in $X$. Hence the
suborder $C$ of $X$ defined by

\[ C := \{ a_\nu \mid \nu \in I \} \]

has the same order type as $I$, and so is well-ordered with type $\leq \omega_\alpha$.

Finally, $C$ is cofinal in $X$. For otherwise, we could get the least $\beta < \omega_\alpha$ with
$a_\beta > x$ in $X$ for all $x \in C$. Then for any $\xi < \beta$, if $\xi \in I$ then $a_\xi \in C$ so we would
have $a_\xi < a_\beta$, and if $\xi \notin I$ then there would be the least $\mu < \xi$ such that $a_\mu > a_\xi$,
and for this $\mu$ we must have $a_\mu \in C$, which implies $a_\xi < a_\mu < a_\beta$. In either case,
we have $a_\xi < a_\beta$ in $X$ for any $\xi < \beta$. Hence $\beta \in I$, so $a_\beta \in C$, contrary to our
assumption. □
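
The index set $I$ in the proof consists of the “record setters” of the enumeration: those positions whose entry exceeds every earlier entry. The Python sketch below computes these indices for a finite list of distinct numbers. It is only meant to illustrate how $I$ and $C$ are formed; a finite list necessarily has a greatest element, so the hypothesis of the proposition is not met, and the name `record_indices` is a hypothetical one introduced here.

```python
def record_indices(seq):
    """Indices v such that seq[v] is greater than every earlier entry."""
    records = []
    best = None
    for v, a in enumerate(seq):
        if best is None or a > best:   # a_v > a_xi for all xi < v
            records.append(v)
            best = a
    return records


enumeration = [3, 1, 4, 2, 7, 5, 9, 6]
I = record_indices(enumeration)
C = [enumeration[v] for v in I]        # strictly increasing by construction
print(I, C)
```

Every entry of the list is bounded above by some element of `C`, which is the finite shadow of the cofinality argument in the proof.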


Corollary 726. Let $X$ be an order without a last element. If $X$ has countable
cofinality, then $X$ has a cofinal subset of order type $\omega$, that is, there exists a strictly
increasing sequence $x_1 < x_2 < \cdots < x_n < \cdots$ in $X$ such that $\{x_n \mid n \in \mathbf{N}\}$ is
cofinal in $X$.


Corollary 727. Let $X$ be a nonempty well-order without a largest element. Then
the least ordinal $\mu$ such that $X$ has a cofinal subset of type $\mu$ is an infinite initial
ordinal $\mu = \omega_\alpha$.


Proof. Let $C$ be a cofinal subset of $X$ of order type $\mu$ and let $|C| = \aleph_\alpha$, so that
$\omega_\alpha \leq \mu$. If $\mu$ were not an initial ordinal, we would have $\omega_\alpha < \mu$. By the proposition,
there is $E \subseteq C$ such that $E$ is cofinal in $C$ and the order type of $E$ is $\leq \omega_\alpha$. Since
$E$ is cofinal in $C$ and $C$ is cofinal in $X$, therefore $E$ is cofinal in $X$. Hence $X$ has a
cofinal subset of type $\leq \omega_\alpha < \mu$, a contradiction. □


From the corollary it follows that for every well-order $X$ there is a unique smallest
cardinal $\mu$ such that $X$ has a cofinal subset of cardinality $\mu$.


Definition 728 (Cofinality of Well-Orders and Ordinals). The cofinality of a
well-order $X$ is the least cardinal $\mu$ such that $X$ has a cofinal subset of cardinality
$\mu$. The cofinality of an ordinal $\alpha$ is the cofinality of the well-order
$W(\alpha) = \{\beta \mid \beta < \alpha\}$.

From the definition and the previous corollary it follows that if $\alpha$ is the ordinal of
a nonempty well-order $X$ without a largest element, then the cofinality of $\alpha$ equals
$\aleph_\mu$ if and only if $X$ has a cofinal subset of type $\omega_\mu$, but of no smaller type.

The cofinality of any successor ordinal is 1. For a limit ordinal $\alpha$, the cofinality
of $\alpha$ equals $\aleph_\mu$ if and only if the least order type of subsets cofinal in $W(\alpha)$ (which
must be an initial ordinal) equals $\omega_\mu$. In particular, the cofinality of any countable
limit ordinal is $\aleph_0$, while the cofinality of $\omega_1$ is $\aleph_1$.

For the rest of this section we will assume the Axiom of Choice so that every
infinite cardinal $\kappa$ equals an aleph $\kappa = \aleph_\alpha$.


Definition 729 (Cofinality of Cardinals (AC)). The cofinality of a cardinal $\kappa =
\aleph_\alpha$, denoted by $\mathrm{cf}(\kappa)$, is the cofinality of the ordinal $\omega_\alpha$, i.e., it is the least cardinal
$\mu$ such that $W(\omega_\alpha)$ has a cofinal subset of cardinality $\mu$.


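Note that the definition of cofinality for cardinals requires the Axiom of Choice.

Theorem 730 (AC). For any cardinal $\kappa > 0$,

1. $\mathrm{cf}(\kappa) \leq \kappa$.
2. $\mathrm{cf}(\mathrm{cf}(\kappa)) = \mathrm{cf}(\kappa)$.


Proof. The first part is immediate. For the second part, suppose that $\kappa = \aleph_\alpha$,
$\mathrm{cf}(\kappa) = \aleph_\mu$, and $\mathrm{cf}(\aleph_\mu) = \aleph_\nu$, so that $\alpha \geq \mu \geq \nu$. Then $W(\omega_\alpha)$ has a cofinal
subset $C$ of order type $\omega_\mu$, but of no smaller type. Since $\mathrm{cf}(\aleph_\mu) = \aleph_\nu$, the order $W(\omega_\mu)$,
and hence the isomorphic order $C$, has a cofinal subset of order type $\omega_\nu$. But if $E$ is
cofinal in $C$ of order type $\omega_\nu$, then $E$ will be cofinal also in $W(\omega_\alpha)$, and so $\omega_\mu$ will
be $\leq$ the order type of $E$, which is $\omega_\nu$. Hence $\omega_\mu = \omega_\nu$, and so $\aleph_\mu = \aleph_\nu$. □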


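Problem 731. If $\alpha$ is a successor ordinal, then $\mathrm{cf}(\aleph_\alpha) = \aleph_\alpha$. If $\alpha$ is a limit ordinal,
then $\mathrm{cf}(\aleph_\alpha)$ equals the cofinality of $\alpha$.

[Hint: For the first part use the fact that $\aleph_\alpha^2 = \aleph_\alpha$.]

Thus $\mathrm{cf}(\aleph_0) = \aleph_0$, $\mathrm{cf}(\aleph_1) = \aleph_1$, while $\mathrm{cf}(\aleph_\omega) = \aleph_0$.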


The following definition is based on cofinalities of cardinals and therefore 
assumes the Axiom of Choice. 


Definition 732 (Successor, Limit, Regular and Singular Cardinals). An infinite
cardinal $\kappa$ is a successor cardinal if $\kappa = \mu^+$ for some cardinal $\mu$; otherwise $\kappa$ is a
limit cardinal. $\kappa$ is a regular cardinal if $\mathrm{cf}(\kappa) = \kappa$; otherwise $\kappa$ is a singular cardinal.


Thus $\aleph_\alpha$ is a successor cardinal if and only if $\alpha$ is a successor ordinal, and $\kappa$ is a
singular cardinal if $\mathrm{cf}(\kappa) < \kappa$. For every $\alpha < \omega$, the cardinal $\aleph_\alpha$ is regular, and $\aleph_\omega$ is
the smallest singular infinite cardinal (the next singular cardinal being $\aleph_{\omega+\omega}$). Since
every successor cardinal is regular, singular cardinals must be limit cardinals. Also,
$\mathrm{cf}(\kappa)$ must be regular for any cardinal $\kappa$, since $\mathrm{cf}(\mathrm{cf}(\kappa)) = \mathrm{cf}(\kappa)$. We record these
facts in the following proposition.

Proposition 733. $\mathrm{cf}(\kappa)$ is a regular cardinal for every infinite cardinal $\kappa$. Every
successor cardinal is regular. Hence singular cardinals must be limit cardinals.


Let us say that an infinite ordinal $\nu$ is a regular ordinal if there is no cofinal subset of
$W(\nu)$ having order type less than $\nu$. By the following problem, the regular ordinals
can be identified with the regular cardinals since they correspond to each other in a
natural one-to-one fashion.


Problem 734. Show that if $\nu$ is a regular ordinal then $\nu$ is an initial infinite ordinal
and so $\nu = \omega_\alpha$ for some $\alpha$. Moreover, the ordinal $\omega_\alpha$ is regular if and only if the
cardinal $\aleph_\alpha$ is regular.


Problem 735 (AC). Let $\kappa$ be an infinite cardinal and $A$ a set with $|A| = \kappa$.


1. $\mathrm{cf}(\kappa)$ is the least cardinal $\mu$ such that $A$ can be expressed as the union of $\mu$
pairwise disjoint sets each of cardinality $< \kappa$. This assertion remains true even
if we drop the qualifier “pairwise disjoint.”

2. $\mathrm{cf}(\kappa)$ is the least cardinal $\mu$ such that $\kappa$ can be expressed as

\[ \kappa = \sum_{i \in I} \kappa_i, \quad \text{where $|I| = \mu$ and $\kappa_i < \kappa$ for $i \in I$.} \]

3. cf(k) is the least cardinal ju such that k can be expressed as 
k= ee, where |I| = wandk; <« fori € 1. 
ié 
As mentioned before, very little can be said about cardinal exponentiation when the 
exponent is infinite, but we do have: 
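
Theorem 736 (König). If $\kappa \geq 2$ and $\mu \geq \aleph_0$ are cardinals, then $\mathrm{cf}(\kappa^\mu) > \mu$.


Proof. It suffices to show that if $\kappa_i < \kappa^\mu$ for all $i \in I$ with $|I| = \mu$, then
$\sum_{i \in I} \kappa_i < \kappa^\mu$. So assume that $\kappa_i < \kappa^\mu$ for all $i \in I$ where $|I| = \mu$.
Using the original König’s Inequality and the fact that $\mu^2 = \mu$, we get:

\[ \sum_{i \in I} \kappa_i < \prod_{i \in I} \kappa^\mu = (\kappa^\mu)^\mu = \kappa^{\mu^2} = \kappa^\mu. \]
□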


It follows that the cardinality of the continuum $\mathfrak{c} = 2^{\aleph_0}$ has cofinality $> \aleph_0$, and so
$\mathbf{R}$ cannot be expressed as the union of countably many sets each of cardinality less
than $\mathfrak{c}$.


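Problem 737. $2^{\aleph_0} = \aleph_n$ for some $n \in \mathbf{N}$ if and only if $\aleph_\omega^{\aleph_0} > 2^{\aleph_0}$, and $2^{\aleph_0} > \aleph_n$
for all $n \in \mathbf{N}$ if and only if $\aleph_\omega^{\aleph_0} = 2^{\aleph_0}$.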


Problem 738 (Hausdorff’s Formula). $\aleph_{\alpha+1}^{\aleph_\beta} = \aleph_\alpha^{\aleph_\beta} \cdot \aleph_{\alpha+1}$.


$(\omega_\alpha, {}^*\omega_\beta)$ Gaps in Orders


We illustrate the use of cofinalities by giving characterizations for each of the weak
completeness properties of orders introduced in Sect. 8.8, namely, the
Bolzano-Weierstrass property, the strong nested intervals property, and the sequential nested
intervals property. This is done by analyzing the cofinality and coinitiality of
Dedekind gaps in orders. By the cofinality of an order $X$ we will mean the least
ordinal $\mu$ such that $X$ has a cofinal subset of order type $\mu$. The coinitiality of an
order $X$ is the cofinality of the reverse order ${}^*X$. By Corollary 727, if $X$ has no last
element, then the cofinality of $X$ is an initial ordinal $\omega_\alpha$.


Definition 739 ($(\omega_\alpha, {}^*\omega_\beta)$ gaps in orderings). An $(\omega_\alpha, {}^*\omega_\beta)$ gap in an order $X$ is
a Dedekind partition $L, U$ of $X$ such that $L$ has cofinality $\omega_\alpha$ and $U$ has coinitiality
$\omega_\beta$, that is, $\omega_\alpha$ is the least ordinal $\mu$ such that $L$ has a cofinal subset of type $\mu$ and
$\omega_\beta$ is the least ordinal $\nu$ such that $U$ has a coinitial subset of type ${}^*\nu$.


Thus for any $(\omega_\alpha, {}^*\omega_\beta)$ gap, $\omega_\alpha$ and $\omega_\beta$ must be regular ordinals (i.e., $\aleph_\alpha$ and $\aleph_\beta$
must be regular cardinals). In fact, we have the following.


Problem 740. A Dedekind partition $L, U$ of an order $X$ is an $(\omega_\alpha, {}^*\omega_\beta)$ gap if and
only if $\aleph_\alpha$ and $\aleph_\beta$ are regular cardinals and there exist $L' \subseteq L$ and $U' \subseteq U$ such
that $L'$ has order type $\omega_\alpha$, $U'$ has order type ${}^*\omega_\beta$, for all $x \in L$ there is $y \in L'$ with
$y > x$, and for all $x \in U$ there is $y \in U'$ with $y < x$.


Thus an ordinary Dedekind gap is simply an $(\omega_\alpha, {}^*\omega_\beta)$ gap for some $\alpha, \beta$. In a
countable order, a Dedekind gap must be an $(\omega_0, {}^*\omega_0)$ (or $(\omega, {}^*\omega)$) gap.


Problem 741 (Characterizing the Bolzano-Weierstrass Property). Show that
an order $X$ satisfies the Bolzano-Weierstrass property if and only if it has no
$(\omega_\alpha, {}^*\omega_\beta)$ gap with $\alpha = 0$ or $\beta = 0$ (i.e., it has no $(\omega_\alpha, {}^*\omega)$ or $(\omega, {}^*\omega_\beta)$ gaps,
or in other words, in every Dedekind gap, both the cofinality and coinitiality are
uncountable).


Problem 742 (Characterizing the Strong NIP). Show that an order $X$ satisfies
the strong nested intervals property if and only if it has no $(\omega_\alpha, {}^*\omega_\alpha)$ gap (i.e., has
no “symmetric” gap with identical cofinality and coinitiality).


Problem 743 (Characterizing the Sequential NIP). Show that an order $X$ satisfies
the sequential nested intervals property if and only if it has no $(\omega, {}^*\omega)$ gaps.


Problem 744. Show that if $X$ is an order which (a) has the Bolzano-Weierstrass
property, (b) has the strong nested intervals property, and (c) has cardinality $< \aleph_2$,
then $X$ must be complete. Show that none of the three conditions in the hypothesis
of this result can be dropped.

The uncountable initial ordinals we have seen so far, $\omega_1, \omega_2, \ldots, \omega_\omega$, etc., all have the
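property that the $\alpha$-th uncountable initial ordinal $\omega_\alpha$ is much larger than the ordinal
$\alpha$. For example $1 < \omega_1$, $2 < \omega_2$, $\omega < \omega_\omega$, and even $\omega_\omega < \omega_{\omega_\omega}$. Equivalently, $\aleph_\alpha$ has
much larger cardinality than the set $W(\alpha) = \{\beta \mid \beta < \alpha\}$ for any $\alpha < \omega_\omega$, or even
for any $\alpha < \omega_{\omega_\omega}$. Hence any ordinal $\alpha$ such that $\omega_\alpha = \alpha$ (the $\alpha$-th uncountable
initial ordinal equals $\alpha$ itself), or equivalently any ordinal such that the cardinality
of $W(\alpha) = \{\beta \mid \beta < \alpha\}$ equals $\aleph_\alpha$, must be larger than all the above ordinals.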


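Problem 745. Show that if

\[ \alpha = \sup\{ \omega, \omega_\omega, \omega_{\omega_\omega}, \omega_{\omega_{\omega_\omega}}, \ldots \} \]

then $\omega_\alpha = \alpha$ and so $|\{\beta \mid \beta < \alpha\}| = \aleph_\alpha$.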


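All uncountable limit cardinals we have seen so far, such as $\aleph_\omega$, $\aleph_{\omega+\omega}$, $\aleph_{\omega_1}$, etc.,
are singular. For the ordinal $\alpha$ of the last problem satisfying $\omega_\alpha = \alpha$, $\aleph_\alpha$ is a much
larger limit cardinal but is still singular, since $\mathrm{cf}(\aleph_\alpha) = \aleph_0$.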


Problem 746. Show that $\alpha \mapsto \omega_\alpha$ is a normal function, and so for every ordinal $\alpha$
there is $\beta > \alpha$ with $\omega_\beta = \beta$.


Definition 747. An uncountable cardinal is weakly inaccessible if it is a regular 
limit cardinal. 


Problem 748. Show that if $\aleph_\alpha$ is a weakly inaccessible cardinal then $\omega_\alpha = \alpha$ and
so $|\{\beta \mid \beta < \alpha\}| = \aleph_\alpha$.


A weakly inaccessible must be quite large, but we also define: 


Definition 749. A cardinal $\kappa$ is a strong limit if $2^\mu < \kappa$ for all $\mu < \kappa$, and $\kappa$ is
strongly inaccessible if it is uncountable, regular, and a strong limit.


A strong limit cardinal is clearly a limit cardinal, and hence a strongly inaccessible
cardinal is weakly inaccessible. It is not possible to show that strongly inaccessible
cardinals exist,¹ nor that weakly inaccessible cardinals exist.

Cardinals which cannot be shown to exist using the standard axioms of set theory 
are called large cardinals. Inaccessible cardinals are the simplest examples of large 
cardinals. The subject area of large cardinals studies much larger cardinals and 
the consequences of adding their existence as axioms. Such axioms have resolved 
some classical mathematical problems which could not be decided under the usual 
axioms, although one famous problem has been stubborn in resisting resolution via 
large cardinals. It is the Continuum Hypothesis, which we discuss next. 


¹ From the standard axioms for set theory (such as ZFC), assuming they are consistent; this follows
from a result of Gödel known as the second incompleteness theorem.

Both $2^{\aleph_0}$ and $\aleph_1$ are uncountable cardinals, and Cantor conjectured that

\[ 2^{\aleph_0} = \aleph_1, \]

an assertion known as the Continuum Hypothesis (CH). Cantor implicitly assumed
the Axiom of Choice, by which $2^{\aleph_0}$ is an uncountable aleph, i.e.,

\[ 2^{\aleph_0} = \aleph_\alpha \ \text{ for some } \alpha \geq 1, \quad \text{and so:} \quad 2^{\aleph_0} \geq \aleph_1. \]

Hence under AC, CH is equivalent to the statement that every set of reals is either
countable or equinumerous to $\mathbf{R}$. (Without AC, we cannot prove that $\mathbf{R}$ can be
well-ordered, or even that $\mathbf{R}$ has a subset of cardinality $\aleph_1$.)


The CH is perhaps the most famous problem of set theory, and the problem of
settling it is known as the continuum problem.² All attempts to settle the CH by
Cantor and by other early twentieth-century mathematicians failed, even though
for most effectively defined sets of reals, they could prove them to be either
countable or equinumerous to $\mathbf{R}$. Cantor and Bendixson proved the important result
that every closed set of reals must either be countable or be equinumerous to
$\mathbf{R}$, which can be informally expressed by saying that “the closed sets satisfy the
CH.” Other mathematicians extended the result to show that a larger class of sets
known as analytic sets satisfy the CH. We will prove both these results in the next
part (Corollary 1081, Theorem 1160). However, Lusin introduced other effectively
defined sets of reals which could not be proved to satisfy the CH. The magnitude of
the cardinal $2^{\aleph_0}$ turned out to be very difficult to estimate, and much of the research in
set theory was driven by investigations into the continuum problem.


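By König’s theorem, we have

\[ \mathrm{cf}(2^{\aleph_0}) > \aleph_0, \]

so $2^{\aleph_0}$ cannot equal any cardinal of countable cofinality, e.g.,

\[ 2^{\aleph_0} \neq \aleph_\omega, \quad 2^{\aleph_0} \neq \aleph_{\omega+\omega}, \quad 2^{\aleph_0} \neq \aleph_{\omega_\omega}, \quad \text{etc.} \]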


That is about as much as we can say about the magnitude of $2^{\aleph_0}$ as an aleph using
the current standard axioms of set theory. In later research, first Gödel introduced
the notion of constructible sets to show that CH cannot be disproved (assuming that
the standard axioms of set theory are consistent). Then Cohen invented the powerful
technique of forcing to show that CH cannot be proved either. Extending Cohen’s
result, Solovay showed that one can consistently assume that $2^{\aleph_0}$ equals $\aleph_\alpha$ for any
$\alpha$ so long as $\aleph_\alpha$ has uncountable cofinality.


² It was the first in Hilbert’s celebrated list of problems presented in 1900.

Problem 750. Show without using the Axiom of Choice that

\[ \aleph_1 \leq 2^{2^{\aleph_0}}. \]

[Hint: For each $\alpha < \omega_1$, consider the subcollection of $\mathbf{P}(\mathbf{Q})$ consisting of those
subsets of $\mathbf{Q}$ which are well-ordered and have order type $\alpha$.]

Problem 751. Define the cardinals $\beth_n$, $n = 0, 1, 2, \ldots$, as

\[ \beth_0 := \aleph_0, \quad \text{and} \quad \beth_{n+1} := 2^{\beth_n}. \]

Thus $\beth_1 = \mathfrak{c}$, $\beth_2 = \mathfrak{f}$, etc. Show without using the Axiom of Choice that

\[ \aleph_n \leq \beth_{2n}, \quad \text{for all } n = 0, 1, 2, \ldots. \]


Under the Axiom of Choice, one can define $\beth_\alpha$ for all ordinals $\alpha$ and it readily
follows that $\aleph_\alpha \leq \beth_\alpha$ for all $\alpha$. On the other hand, by the Cohen-Solovay result
mentioned above, no inequality of the form

\[ 2^{\aleph_0} \leq \aleph_\alpha, \]

however large $\alpha$ may be, can be obtained using the usual axioms of set theory.

Problem 752. Show that CH holds if and only if $\aleph_1^{\aleph_0} < \aleph_2^{\aleph_0}$.

Problem 753. Show that CH implies $\aleph_n^{\aleph_0} = \aleph_n$ for all finite cardinals $n > 0$.


[Hint: Use induction, and the fact that $\mathrm{cf}(\aleph_n) > \aleph_0$ for $n > 0$.]


The Generalized Continuum Hypothesis 


After CH, Hausdorff introduced the much stronger statement:

\[ 2^{\aleph_\alpha} = \aleph_{\alpha+1} \quad \text{for every ordinal } \alpha, \]

which is known as the Generalized Continuum Hypothesis (GCH). It is a much
stronger assumption and the cardinal power $\aleph_\alpha^{\aleph_\beta}$ can be completely determined
using GCH. Using the notation $\beth_\alpha$ as above, GCH holds if and only if $\aleph_\alpha = \beth_\alpha$ for all
$\alpha$. Under an additional assumption called the Axiom of Foundation, GCH becomes
equivalent to the statement

\[ 2^\kappa = \kappa^+ \quad \text{for every infinite cardinal } \kappa, \]

which readily implies the Axiom of Choice.


Problem 754. Show (without using AC) that if $2^\kappa = \kappa^+$ for every infinite cardinal $\kappa$, then
the Axiom of Choice holds.


Problem 755. Under GCH, show that if $\aleph_\alpha < \aleph_{\beta+1}$ then $\aleph_{\beta+1}^{\aleph_\alpha} = \aleph_{\beta+1}$.


$\eta_1$-Orderings and the CH


The $\eta_1$ orderings were introduced and studied by Hausdorff as a generalization
of the order type $\eta$ to the next higher cardinal $\aleph_1$. The behavior of $\eta_1$ orderings
of cardinality $\aleph_1$ is similar to that of countable dense orderings without endpoints
(Problem 766, Problem 767, Corollary 768). However, as we will see, the problem
is inextricably linked to the CH, since the existence of $\eta_1$ orderings of cardinality
$\aleph_1$ is equivalent to the CH (Problem 764).


Definition 756 ($\eta_1$ orderings). An order $X$ is an $\eta_1$ ordering if for any countable
$A, B \subseteq X$, if $A < B$ then there exists $p \in X$ such that $A < p < B$.


Problem 757. Every $\eta_1$ ordering is a nontrivial dense order without endpoints in
which every countable subset is bounded (both above and below).

In particular, every sequence is bounded, but no strictly increasing sequence is
convergent (i.e., a strictly increasing sequence does not have a supremum).


[Hint: The sets $A$ and $B$ in the definition are allowed to be empty.]


Problem 758. An order is an $\eta_1$ order if and only if it is a nontrivial dense
linear ordering without endpoints in which every sequence is bounded, no strictly
monotone sequence is convergent, and there are no $(\omega, {}^*\omega)$ gaps.


Problem 759. In an $\eta_1$ ordering a nonempty open interval $\{x \mid a < x < b\}$ is an
$\eta_1$ ordering, but not all open segments are $\eta_1$ orderings.


Problem 760. An $\eta_1$ order $X$ contains suborders isomorphic to $Y$ for every order
$Y$ with $|Y| \leq \aleph_1$ (and so $X$ has suborders of type $\alpha$ for every $\alpha < \omega_2$).


Definition 761 (Lexicographic Powers). If $X$ is an order and $\alpha$ is any ordinal, we
define an order on $X^{W(\alpha)}$ by defining, for $a, b \in X^{W(\alpha)}$, $a < b$ if and only if there is
$\xi < \alpha$ such that $a(\xi) < b(\xi)$ and $a(\eta) = b(\eta)$ for all $\eta < \xi$.
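
For a finite index set the lexicographic comparison can be carried out directly: find the first coordinate where the two sequences differ and compare there. The Python sketch below assumes finite sequences of equal length, so it illustrates only the finite case of the definition; the name `lex_less` is a hypothetical one introduced here.

```python
def lex_less(a, b):
    """True if a < b in the lexicographic order on equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    for x, y in zip(a, b):
        if x != y:            # first coordinate xi where a and b differ
            return x < y      # a < b exactly when a(xi) < b(xi)
    return False              # identical sequences are not strictly less


print(lex_less((0, 1, 1), (1, 0, 0)))
print(lex_less((0, 1, 0), (0, 1, 0)))
```

Python's built-in comparison of tuples follows the same rule, so `lex_less(a, b)` agrees with `a < b` for tuples of equal length.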


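Problem 762. Let $H_1$ be the suborder of the lexicographic power $\{0, 1\}^{W(\omega_1)}$
consisting of all binary $\omega_1$-sequences with a last 1, that is:

\[ H_1 = \{ \langle a_\nu \rangle_{\nu < \omega_1} \in \{0, 1\}^{W(\omega_1)} \mid \exists \beta < \omega_1 \, (a_\beta = 1 \text{ and } a_\nu = 0 \ \forall \nu > \beta) \}. \]

1. $H_1$ is an $\eta_1$ ordering of cardinality $\mathfrak{c}$.

2. Let $c = \langle c_\nu \mid \nu < \omega_1 \rangle \in \{0, 1\}^{W(\omega_1)}$ be the binary $\omega_1$-sequence defined by setting
$c_\nu = 1$ if $\nu$ is even, and $c_\nu = 0$ if $\nu$ is odd. Let $L = \{a \in H_1 \mid a < c\}$ and
$U = \{a \in H_1 \mid c < a\}$. Show that $L, U$ form an $(\omega_1, {}^*\omega_1)$ gap in $H_1$.

3. Show that $H_1$ has exactly $2^{\aleph_1}$ many $(\omega_1, {}^*\omega_1)$ gaps.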


Problem 763. Show that every $\eta_1$ ordering contains a subset isomorphic to $\mathbf{R}$ and
so has cardinality at least $\mathfrak{c}$.


Problem 764. Show that the Continuum Hypothesis is equivalent to the statement
that there exists an $\eta_1$ ordering of cardinality $\aleph_1$.

The Dedekind completion of an $\eta_1$ ordering cannot be an $\eta_1$ ordering. However, we
have the following result.


Problem 765 ($(\omega_1, {}^*\omega_1)$ Completion). Suppose that $X$ is a suborder of an order
$Y$ such that for any $p \in Y \setminus X$, $p$ is a two-sided limit point of $X$ and the sets
$L := \{x \in X \mid x < p\}$ and $U := \{x \in X \mid p < x\}$ form an $(\omega_1, {}^*\omega_1)$ gap in $X$.
Show that if $X$ is an $\eta_1$ ordering then so is $Y$.

Conclude that for any $\eta_1$ ordering $X$, the “$(\omega_1, {}^*\omega_1)$ completion of $X$,” i.e., the
ordering obtained by “filling in” all the $(\omega_1, {}^*\omega_1)$ gaps in $X$, is an $\eta_1$ ordering which
has no $(\omega_1, {}^*\omega_1)$ gap.


The following problems form the “$\eta_1$ analogues” of Cantor’s theorem on countable
dense linear orders without endpoints (characterizing the order type $\eta$) and the proof
that $\mathbf{R}$ is uncountable that follows from it. Note, however, that the results are vacuous
unless we assume CH since without CH there are no $\eta_1$ orderings of cardinality $\aleph_1$
(Problem 764).


Problem 766. Any two $\eta_1$ orderings of cardinality $\aleph_1$ are order isomorphic.

[Hint: Mimic Cantor’s “back-and-forth” proof.]

Problem 767. Any $\eta_1$ ordering of cardinality $\aleph_1$ must have $(\omega_1, {}^*\omega_1)$ gaps.

[Hint: Removing a point from any $\eta_1$ ordering produces an $\eta_1$ ordering with an
$(\omega_1, {}^*\omega_1)$ gap.]


Corollary 768. Any $\eta_1$ ordering without $(\omega_1, {}^*\omega_1)$ gaps has cardinality $> \aleph_1$.


If $X$ is an $\eta_1$ ordering of cardinality $\aleph_1$, then by Problem 765 the “$(\omega_1, {}^*\omega_1)$
completion” of $X$ will be an $\eta_1$ ordering without $(\omega_1, {}^*\omega_1)$ gaps, and so will have
cardinality $> \aleph_1$. Thus just as a countable dense linear order has uncountably many
irrational Dedekind gaps, similarly every $\eta_1$ ordering of cardinality $\aleph_1$ has more
than $\aleph_1$ $(\omega_1, {}^*\omega_1)$ gaps.


Chapter 11 
Posets, Zorn’s Lemma, Ranks, and Trees 


Abstract This chapter covers the very basics of the following topics: Partial orders,
Zorn’s Lemma and some of its applications, well-founded relations and ranks on
them, trees, König’s Infinity Lemma, well-founded trees, and Ramsey’s theorem.


11.1 Partial Orders 


A linear order $<$ is an irreflexive transitive relation which is also connected, i.e., if
$x \neq y$ then either $x < y$ or $y < x$ (any two distinct elements are comparable). By
dropping this last condition of comparability, we get the more general notion of a
partially ordered set or simply a poset.


Definition 769. A strict poset is a pair $(P, \prec)$ where $P$ is a set and $\prec$ is a binary
relation on $P$ which is irreflexive ($x \nprec x$ for all $x$) and transitive (if $x \prec y$ and
$y \prec z$ then $x \prec z$) on the set $P$. (Note that “$a \nprec b$” means that “$a \prec b$ is false.”)


It is easy to verify that if $(P, \prec)$ is a strict poset, then the relation $\prec$ is asymmetric
on $P$ (for all $x, y \in P$, if $x \prec y$ then $y \nprec x$).

Posets also come in an essentially equivalent “reflexive” variety:


Definition 770. A reflexive poset is a pair $(P, \preceq)$ where $P$ is a set and $\preceq$ is a binary
relation on $P$ which is reflexive ($x \preceq x$ for all $x$), antisymmetric (if $x \preceq y$ and
$y \preceq x$ then $x = y$), and transitive (if $x \preceq y$ and $y \preceq z$ then $x \preceq z$) on the set $P$.


It is easy to verify that if $(P, \prec)$ is a strict poset, then the relation $\preceq$ on $P$ defined
by $x \preceq y \iff x \prec y$ or $x = y$ makes $(P, \preceq)$ a reflexive poset, from which the
original strict poset can be recovered by defining $x \prec y \iff x \preceq y$ and $x \neq y$.
Similarly, one can start from a reflexive poset $(P, \preceq)$, then get a strict poset by
setting $x \prec y \iff x \preceq y$ and $x \neq y$, and recover back the original reflexive poset
as before. This gives, for each set $P$, a natural one-to-one correspondence between
the strict posets and reflexive posets over $P$.
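
The passage between strict and reflexive posets can be checked mechanically on a small example. In the Python sketch below a relation on a finite set is represented as a set of ordered pairs; this representation and the names `to_reflexive` and `to_strict` are assumptions made for illustration, not notation from the text.

```python
def to_reflexive(P, strict):
    """Add the diagonal: x <= y iff x < y or x = y."""
    return set(strict) | {(x, x) for x in P}


def to_strict(reflexive):
    """Remove the diagonal: x < y iff x <= y and x != y."""
    return {(x, y) for (x, y) in reflexive if x != y}


# Strict divisibility on {1, 2, 3, 6}: x < y iff x properly divides y.
P = {1, 2, 3, 6}
strict = {(x, y) for x in P for y in P if x != y and y % x == 0}
reflexive = to_reflexive(P, strict)

print(to_strict(reflexive) == strict)                 # round trip recovers the strict order
print(to_reflexive(P, to_strict(reflexive)) == reflexive)
```

Both round trips return the relation they started from, which is the one-to-one correspondence described above in this particular case.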

Thus the notions of reflexive and strict posets are essentially variants of the same
concept, analogous to that of inclusive set inclusion $\subseteq$ and proper set inclusion $\subsetneq$.
From now on we will use the term poset to denote either a reflexive or a strict poset,
as determined by context or notation.


Problem 771. Every linear order is a poset. Every subset (restriction) of a poset is 
a poset. 


Definition 772. Let $(P, \preceq)$ be a poset, $A \subseteq P$, and $a \in P$. We say that


1. $a$ is a lower bound of $A$ if $a \preceq x$ for all $x \in A$. Upper bounds are similarly defined.

2. $a$ is a least element of $A$, written as $a = \min(A)$, if $a$ is a lower bound of $A$ which
is also a member of $A$. Greatest elements are similarly defined.

3. $a$ is a minimal element of $A$ if $a \in A$ and there is no $x \in A$ distinct from $a$ with
$x \preceq a$. Maximal elements are similarly defined.

4. $a$ is the least upper bound or supremum of $A$, written as $a = \bigvee A$ or $a = \sup A$,
if $a$ is an upper bound of $A$ and $a \preceq x$ for every upper bound $x$ of $A$. Greatest
lower bounds or infimums are similarly defined.


Note that a set A can have at most one least element, at most one greatest element, 
at most one supremum (least upper bound), and at most one infimum (greatest lower 
bound). 
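
The difference between a least element and a minimal element shows up already for divisibility on a small set of positive integers. The following Python sketch is a finite illustration only; the helper names `minimal_elements` and `least_element` are hypothetical, and zero is excluded so that the remainder test for divisibility is always defined.

```python
def divides(x, y):
    """Reflexive divisibility order on positive integers: x <= y iff x divides y."""
    return y % x == 0


def minimal_elements(A, leq):
    """Elements a of A with no x in A, distinct from a, satisfying x <= a."""
    return {a for a in A if not any(x != a and leq(x, a) for x in A)}


def least_element(A, leq):
    """The unique lower bound of A lying in A, or None if there is none."""
    for a in A:
        if all(leq(a, x) for x in A):
            return a
    return None


A = set(range(2, 13))
print(minimal_elements(A, divides))      # the primes in A
print(least_element(A, divides))         # None: no single element divides all of A
print(least_element({2, 4, 8}, divides))
```

The set `A` has several minimal elements but no least element, while the chain `{2, 4, 8}` has a least element.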


Definition 773. Let $x$ and $y$ be elements of a poset $(P, \preceq)$. We say that:


1. $x$ and $y$ are comparable if either $x \preceq y$ or $y \preceq x$; otherwise, $x$ and $y$ are
incomparable.

2. $x$ and $y$ are compatible if there is $z$ such that $z \preceq x$ and $z \preceq y$; otherwise, $x$ and
$y$ are incompatible.


In a linear order, every pair of elements are comparable and therefore compatible. 
In a poset with a least element, every pair of elements are compatible. 


Definition 774. Let A be a subset of a poset (P, ≼). 


1. A is called an initial part of the poset (P, ≼) or a downward closed subset of P 
if for all x, y ∈ P, x ∈ A and y ≼ x ⇒ y ∈ A. 

2. A is bounded above (in P) if there is an element of P which is an upper bound 
of A. 

3. A is called a chain if A is linearly ordered by ≼, i.e., if any two elements of A 
are comparable. 

4. A is called an antichain if any two elements of A are incompatible. 
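
The difference between "incomparable" and "incompatible" is easy to check on a small example. The sketch below is an illustration under stated assumptions: the poset is the family of nonempty subsets of {1, 2, 3} under inclusion (the empty set is left out, since a least element would make every pair compatible), and all function names are hypothetical.

```python
from itertools import combinations

# Illustrative poset: the nonempty subsets of {1, 2, 3} under inclusion.
ground = [1, 2, 3]
P = [frozenset(c) for r in range(1, 4) for c in combinations(ground, r)]

def leq(x, y):
    return x <= y  # subset test on frozensets

def comparable(x, y):
    return leq(x, y) or leq(y, x)

def compatible(x, y):
    """Some z in P lies below both x and y."""
    return any(leq(z, x) and leq(z, y) for z in P)

def is_chain(A):
    return all(comparable(x, y) for x, y in combinations(A, 2))

def is_antichain(A):
    """Antichain in the sense of Definition 774: pairwise incompatible."""
    return not any(compatible(x, y) for x, y in combinations(A, 2))

a, b = frozenset({1, 2}), frozenset({2, 3})
print(comparable(a, b), compatible(a, b))  # False True ({2} lies below both)
print(is_chain([frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})]))  # True
print(is_antichain([frozenset({1}), frozenset({2}), frozenset({3})]))       # True
print(is_antichain([a, b]))                                                 # False
```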


A most important example of a poset is obtained by taking any family of sets with 
set inclusion ⊆ as the ordering relation. 


Problem 775. Let X be a set, and P = P(X). Then both (P, ⊆) and (P, ⊇) are 
posets. What are the least and greatest elements of (P, ⊆)? If A is the collection of 
all nonempty proper subsets of X, then what are the minimal and maximal elements 
of A in the poset (P, ⊆)? Does A have least or greatest elements? If B is an initial 
part of (P, ⊆) which is also a chain, what can you say about B? 


Problem 776. Let P be the set of nonnegative integers and for x, y ∈ P define 
x ≼ y ⇔ x divides y. Then (P, ≼) is a poset. Does P have least or greatest 
elements? Let A = {n ∈ P | n > 2}. What are the minimal elements of A? Give an 
example of an infinite initial part of P which is a chain. 


Theorem 777 (A representation theorem for posets). Let (P, ≼) be a reflexive 
poset. Then there is a set X and a subset S ⊆ P(X) such that (P, ≼) is isomorphic 
to (S, ⊆); that is, there is a bijection F: P → S such that ∀x, y ∈ P, x ≼ y ⇔ 
F(x) ⊆ F(y). 

Proof. Define, for x ∈ P, F(x) := {y ∈ P | y ≼ x}. Now put X := P, and 
S := {F(x) | x ∈ P}. □ 
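
The map F of the proof can be checked mechanically on a small reflexive poset. The sketch below is an illustration only; the choice of poset (the divisors of 12 under divisibility) and the name leq are assumptions made for the example.

```python
# Illustrative reflexive poset: the divisors of 12 under divisibility.
P = [1, 2, 3, 4, 6, 12]

def leq(x, y):
    return y % x == 0

def F(x):
    """F(x) = {y in P : y ≼ x}, the set of elements below x."""
    return frozenset(y for y in P if leq(y, x))

S = {F(x) for x in P}

# F is injective, and x ≼ y holds exactly when F(x) is a subset of F(y).
assert len(S) == len(P)
assert all(leq(x, y) == (F(x) <= F(y)) for x in P for y in P)
print(sorted(F(6)))  # [1, 2, 3, 6]
```

Reflexivity is what makes F injective here: x belongs to F(x), so F(x) = F(y) forces x ≼ y and y ≼ x.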

Problem 778. Find a chain C in the poset (P(N), ⊊) such that C is order 
isomorphic to R under its usual ordering. 


[Hint: Try using P(Q) instead of P(N).] 


Problem 779. Consider the poset P = N∖{1} with divisibility as the ordering 
relation. Find a necessary and sufficient condition for a subset to be an antichain. 


Definition 780 (Increasing Maps, Embeddings, and Isomorphisms). Suppose 
that (P, <_P) and (Q, <_Q) are strict posets, and let f: P → Q. We say that 


1. f is strictly increasing if x <_P y ⇒ f(x) <_Q f(y) (for all x, y ∈ P). 
2. f is an embedding if x <_P y ⇔ f(x) <_Q f(y) (for all x, y ∈ P). 
3. f is an isomorphism if f is a bijective embedding of P onto Q. 


Problem 781. Suppose that (S,<) is a linear order, (P, <) is a strict poset, and 
f: S → P. Show that if f is strictly increasing, then it must be an embedding. 


11.2 Zorn’s Lemma 


An extremely useful consequence of AC is known as Zorn’s Lemma, which asserts 
that if in a poset every chain is bounded above then the poset has a maximal element. 


Theorem 782 (Zorn’s Lemma). The Axiom of Choice implies that if every chain 
in a poset is bounded above then the poset has a maximal element. 


Proof. Let (P, ≼) be a poset in which every chain is bounded above, and let 
g: P*(P) → P be a choice function (so that g(E) ∈ E whenever E is a nonempty 
subset of P). For each E ⊆ P, let U(E) be the set of those elements of E which 
are upper bounds of its complement P∖E: 


U(E) := {x | x ∈ E and y ≼ x for all y ∈ P∖E}. 

Define a derivative operator ∇: P(P) → P(P) by 


∇(E) := E∖{g(U(E))}   if U(E) is nonempty, 
∇(E) := E             otherwise. 


As usual, let P^(α) denote the α-th iterated ∇-derivative of P. We then have a least 
ordinal μ such that P^(μ+1) = P^(μ), which means the set C := P∖P^(μ) contains 
all its upper bounds. Now C is partitioned as C = ⋃_{α<μ} (P^(α)∖P^(α+1)), and for each 
α < μ, the set P^(α)∖P^(α+1) is a singleton whose member is an upper bound of 
P∖P^(α). Thus C is a chain. Let p be an upper bound of C. Since C contains all 
its upper bounds, p ∈ C is the greatest element of C, and so p must be a maximal 
element of P. □ 
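
In a finite poset the iteration in this proof terminates after finitely many steps and can be traced by hand or by machine. The following sketch is a finite illustration only, under stated assumptions: the poset is {1, ..., 30} under divisibility, and the choice function g simply takes the numerically smallest member, which is an arbitrary choice.

```python
# Illustrative finite poset: {1, ..., 30} under divisibility.
P = set(range(1, 31))

def leq(x, y):
    return y % x == 0

def g(E):
    """A choice function on nonempty sets; taking min is an arbitrary choice."""
    return min(E)

def U(E):
    """Elements of E that are upper bounds of the complement P - E."""
    return {x for x in E if all(leq(y, x) for y in P - E)}

E, C = set(P), []
while U(E):          # stop once the derivative no longer changes E
    x = g(U(E))
    E.remove(x)      # E becomes the next derivative
    C.append(x)      # C collects the removed elements

print(C)  # [1, 2, 4, 8, 16]
```

The removed elements form a chain, and the last one removed (here 16) has no proper multiple in the poset, so it is a maximal element, as in the proof.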


The converse of the above implication is also true. 


Proposition 783. Zorn’s Lemma implies the Axiom of Choice: If it is true that any 
poset in which every chain is bounded above must contain a maximal element, then 
the Axiom of Choice holds. 


Proof. Let P be any family of pairwise disjoint nonempty sets, and consider the 
collection C of those subsets A of ⋃P such that |A ∩ E| ≤ 1 for every E ∈ P. 
Then C forms a poset under set inclusion in which every chain is easily verified to 
have an upper bound. Hence C has a maximal element M, for which we will have 
|M ∩ E| = 1 for every E ∈ P. Thus Zorn’s Lemma is another equivalent of AC. □ 


In many applications of AC, Zorn’s Lemma facilitates and simplifies proofs. For 
example, using Zorn’s Lemma, one can readily establish the following standard 
mathematical result: 


Problem 784. In any vector space, every linearly independent subset is contained 
in a maximal linearly independent subset (called a basis).¹ 


The next problem gives another equivalent of AC known as the Hausdorff Maximal 
Principle. 


Problem 785. Show without using the Axiom of Choice that Zorn’s Lemma is 
equivalent to the statement that every chain in a poset is contained in some maximal 
chain. 


Combining all the equivalents of AC that we have obtained, we get: 

Theorem 786. The following conditions are equivalent to each other: 


1. The Axiom of Choice, Partition Version. Every partition has a choice set. 

2. The Axiom of Choice, Choice Function Version. Every family of nonempty sets 
has a choice function. 

3. The Well-Ordering Theorem. Every set can be well-ordered. 

4. Cardinal Comparability. If κ, μ are cardinals then either κ ≤ μ or μ ≤ κ. 

5. Zorn’s Lemma. A poset in which every chain is bounded above has at least one 
maximal element. 

6. Hausdorff Maximal Principle. Every chain in a poset is contained in a maximal 
chain. 


¹Blass has shown that the statement of Problem 784 is actually equivalent to the Axiom of Choice. 


11.3 Some Applications and Examples 


We will now see some examples of applications of Zorn’s Lemma and the Hausdorff 
Maximal Principle. Throughout this section, we will assume the Axiom of Choice. 


Almost Disjoint Families 


The following is a direct generalization of Definition 392. 


Definition 787 (Almost Disjoint Family). If X is an infinite set with κ = |X|, we 
say that C ⊆ P(X) is an almost disjoint family of subsets of X if 


1. |E| = κ for all E ∈ C. 
2. If E₁, E₂ ∈ C and E₁ ≠ E₂ then |E₁ ∩ E₂| < κ. 


Let X be an infinite set with κ = |X|, and define a relation ⊂* on P(X) by 
A ⊂* B ⇔ |A∖B| < κ and |B∖A| = κ. 


Problem 788. P(X) is a poset under ⊂* in which the minimal elements are 
precisely the subsets of X of cardinality < κ = |X|. 


Let us remove the minimal elements of (P(X), ⊂*) to obtain the “subposet” 
P_κ(X) := {E | E ⊆ X and |E| = κ}, which does not have any minimal element. 
Now note that C is an almost disjoint family of subsets of X if and only if C is an 
antichain in the poset (P_κ(X), ⊂*). 

Also, since κ² = κ, P_κ(X) has an antichain of size κ. We will show that if κ is a 
regular cardinal, then antichains of size > κ can be obtained. 


Lemma 789. If |X| = κ is a regular cardinal and C is an almost disjoint family of 
subsets of X with |C| = κ, then there is E ⊆ X such that E ∉ C and C ∪ {E} is 
still almost disjoint. 


Proof. Assume κ = |X| is regular with κ = ℵ_γ (say). Since |C| = κ, we can 
enumerate C as C = {E_β | β < ω_γ}, where |E_ξ ∩ E_η| < κ for ξ ≠ η. Now for 
each β < ω_γ, we must have E_β∖⋃_{ξ<β} E_ξ = E_β∖⋃_{ξ<β} (E_β ∩ E_ξ) nonempty, since 
|E_β| = κ while |⋃_{ξ<β} (E_β ∩ E_ξ)| < κ by regularity of κ. Hence by the Axiom 
of Choice we can pick x_β ∈ E_β∖⋃_{ξ<β} E_ξ for each β < ω_γ. Let E := {x_β | β < ω_γ}. Then 
|E| = |W(ω_γ)| = κ since x_ξ ≠ x_η for ξ ≠ η. Moreover, for any β < ω_γ, we have 
E_β ∩ E ⊆ {x_ξ | ξ ≤ β}, so |E_β ∩ E| ≤ |W(β + 1)| < κ. □ 

Now, since κ² = κ, we may fix a pairwise disjoint family C₀ of κ-many subsets of X, 
all having size κ. Notice that the union of any chain of almost disjoint families is 
itself an almost disjoint family. Hence the poset consisting of all the almost disjoint 
families containing C₀ and ordered by inclusion (of families) has the property that 
every chain in this poset has an upper bound. By Zorn’s Lemma we may fix a 
maximal almost disjoint family C containing C₀. Then |C| ≥ κ, but we cannot 
have |C| = κ since then C would not be maximal by the above Lemma. Hence we 
have: 


Theorem 790. Let X be a set of regular infinite cardinality κ = |X|. Then there 
exists an almost disjoint family C of subsets of X with |C| > κ. 


For the case where κ = ℵ₀, we saw in Problem 393 that one can obtain an almost 
disjoint family of size 2^ℵ₀ = 𝔠 in a highly effective fashion. The above theorem 
generalizes the result to larger cardinalities in a weaker fashion and is not effective. 
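
One standard effective construction for the countable case (not necessarily the one intended in Problem 393) assigns to each infinite 0-1 sequence the set of integer codes of its finite prefixes. Two distinct sequences share only the prefixes before their first disagreement, so the corresponding sets are infinite but have finite intersection. The sketch below illustrates this on three sample sequences; the coding scheme and all names are assumptions made for the example.

```python
import math

def prefix_codes(s, depth):
    """Codes of the first `depth` finite prefixes of the 0-1 sequence s.

    s is a function n -> 0 or 1; the prefix b_0 ... b_{k-1} is coded by the
    integer whose binary expansion is 1 b_0 ... b_{k-1}.
    """
    codes, code = set(), 1
    for n in range(depth):
        code = 2 * code + s(n)
        codes.add(code)
    return codes

# Three sample sequences (arbitrary choices for the illustration).
zeros = lambda n: 0
alternating = lambda n: n % 2
squares = lambda n: 1 if math.isqrt(n) ** 2 == n else 0

depth = 200
A, B, C = (prefix_codes(s, depth) for s in (zeros, alternating, squares))
print(len(A), len(B), len(C))              # 200 200 200
print(len(A & B), len(A & C), len(B & C))  # 1 0 0
```

Increasing depth makes each set larger without changing the intersections, which reflects the fact that the full (infinite) sets are almost disjoint.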


Problem 791. Assuming the Continuum Hypothesis show that a set of cardinality 
ℵ₁ has an almost disjoint family of size 2^ℵ₁. 


[Hint: Problem 393 can help.] 


Short Linear Orders 


Definition 792 (Short Orders). A linear order is called short if it does not contain 
any suborder of type ω₁ or *ω₁. 


In other words, X is short if X does not contain a strictly increasing or strictly 
decreasing ω₁-sequence. 


Problem 793. A suborder of a short order is short. An order which is a countable 
union of short suborders is itself short. If X and Y are short orders, then X × Y with 
the lexicographic ordering is short. 


Since the usual ordering on the separable continuums R and [0, 1] is short, we can 
get examples of many short orders, such as the (lexicographically ordered) non-CCC 
continuums [0, 1]^k, for k = 2, 3, .... 


Problem 794. Let X be an order and Y a short suborder such that for all x, y ∈ 
X, if there is z ∈ X with x < z < y then there is a w ∈ Y with x < w < y. Then X 
is short. 


We can manufacture more examples of short orders by taking “countable lexico- 
graphic powers” as follows. 
Problem 795. If X is short and α < ω₁, then the lexicographic power X^α is 
short. 


[Hint: Use transfinite induction on α. Note that X^(α+1) is order isomorphic to 
X^α × X with lexicographic order. If α is a limit ordinal, fix a ∈ X and consider 
the suborder Y of X^α consisting of those elements of X^α which eventually take the 
constant value a. Then by induction hypothesis, Y is a countable union of short 
suborders, and an application of Problem 794 shows that X^α must be short.] 


Problem 796. For any ordinal α, the lexicographic power {0, 1}^(ω_α) does not 
contain any suborder of type ω_{α+1} or *ω_{α+1}. 


The following main result on short orders is proved using Zorn’s Lemma. 


Theorem 797 (Hausdorff). If an order X is a union of ℵ₁-many short suborders, 
then X can be embedded in any η₁ order. In particular, every short order can be 
embedded in any η₁ order. 


Proof. The proof is based on the following extension lemma. 


Lemma 798. If A is a short linear order, B is an η₁ order, S ⊆ A, and f: S → B 
is strictly increasing, then f can be extended to a strictly increasing map from all 
of A into B. 


Proof (Lemma). Let F be the family of all strictly increasing functions which extend 
f and map some subset of A into B. Partially order F under extension. Then by 
Zorn’s Lemma, there is a maximal member g ∈ F. We claim that dom(g) = A. 
Otherwise there would exist a ∈ A∖dom(g). Let L := {x ∈ dom(g) | x < a} 
and R := {x ∈ dom(g) | a < x}. Since A is short, there exist countable sets P 
and Q with P cofinal in L and Q coinitial in R. Put C := g[P] and D := g[Q]. 
Then C and D are countable subsets of B with C < D (in B). Since B is an 
η₁ order, there is b ∈ B with C < {b} < D. Define an extension h of g with 
dom(h) = dom(g) ∪ {a} by setting h(a) = b and h(x) = g(x) for x ∈ dom(g). 
Then h is a strictly increasing proper extension of g, contradicting the maximality 
of g. □ 


To finish the proof of the theorem, let A be an order such that A = ⋃_{α<ω₁} A_α where 
each suborder A_α is short. We can assume that the sets A_α increase with α (since 
otherwise we could replace A_α by ⋃_{β≤α} A_β). Now let B be any η₁ order. Using 
the lemma and the Axiom of Choice, we can build by transfinite induction strictly 
increasing functions f_α: A_α → B, α < ω₁, such that if α < β < ω₁ then f_β extends 
f_α. Then the common extension of all the functions f_α is a strictly increasing map 
from A to B. □ 


Problem 799. Any order X with |X| > 𝔠 has a suborder of type ω₁ or *ω₁. 


Recall that in Problem 762, we defined H₁ as the suborder of the lexicographic 
power {0, 1}^(ω₁) consisting of all binary ω₁-sequences with a last 1. H₁ was an η₁ 
order of cardinality 𝔠 containing 2^ℵ₁ many (ω₁, *ω₁) gaps. 

Problem 800. H₁ can be expressed as the union of ℵ₁ short suborders. Hence, 
every η₁ order contains a suborder isomorphic to H₁. 


Some Posets Containing η₁ Chains 


The four related posets below all contain chains which are η₁ orders. Consequently 
they contain chains of order type α and *α for each ordinal α < ω₂. 


Problem 801 (Orders of Magnitude for Positive Sequences). Let S denote the set 
(R⁺)^N of all sequences of positive real numbers, and for any x = ⟨x_n | n ∈ N⟩ ∈ S 
and y = ⟨y_n | n ∈ N⟩ ∈ S define 


x ≺ y if and only if lim_{n→∞} x_n / y_n = 0. 


(x ≺ y is often written in the “little-oh notation” as x_n = o(y_n).) Show that 


1. Under the relation ≺, S is a poset of size 𝔠. 

2. For any countable subset C of S, there exist x, y ∈ S such that x ≺ C ≺ y. 

3. If A and B are countable chains in S with A ≺ B (i.e., x ≺ y for all x ∈ A and 
y ∈ B), then there is p ∈ S such that A ≺ p ≺ B. 

4. Every maximal chain in S is an η₁ ordering. 


Problem 802 (Orders of Infinity for Sequences with Limit ∞). Let M be the 
set of all sequences of natural numbers f ∈ N^N which approach +∞, i.e., with 
lim_n f(n) = +∞. For f, g ∈ M define f ≺ g if and only if lim_n (g(n) − f(n)) = 
+∞. Show that 


1. Under the relation ≺, M is a poset of size 𝔠. 

2. For any countable C ⊆ M, there exist f, g ∈ M such that f ≺ C ≺ g. 

3. If A and B are countable chains in M with A ≺ B (i.e., f ≺ g for all f ∈ A 
and g ∈ B), then there is h ∈ M such that A ≺ h ≺ B. 

4. Every maximal chain in M is an η₁ ordering. 


Problem 803 (Ordering on P(N) modulo finite sets). Let P be the collection of 
all subsets A of N such that both A and its complement are infinite. For A, B ∈ P 
define A ≺ B if and only if A∖B is finite and B∖A is infinite. Assuming the Axiom 
of Choice, show that: 


1. Under the relation ≺, P is a poset of size 𝔠. 

2. A and B are incompatible if and only if A ∩ B is finite. 

3. Every antichain of size ℵ₀ is properly contained in another antichain. 

4. There is an antichain of cardinality 𝔠. [Hint: See Problem 393.] 

5. If X, Y ⊆ P are countable chains such that A ≺ B for all A ∈ X and B ∈ Y, then 
there is M such that A ≺ M ≺ B for all A ∈ X and B ∈ Y. 

6. Any maximal chain in P must be an η₁ ordering. 

Problem 804 (The Strict Dominating Order). For f, g ∈ N^N, say that g 
dominates f and write f <* g if and only if there is m such that f(n) < g(n) 
for all n ≥ m. Show that in the poset H := (N^N, <*): 


1. Every f ∈ N^N has an immediate successor, that is, there is g ∈ N^N such that 
f <* g and there is no h with f <* h <* g. 

2. Every countable subset is bounded above. 

3. If f₁ <* f₂ <* ⋯ <* f_n <* f_{n+1} <* ⋯ <* f, then there is g with f_n <* 
g <* f for all n. Thus no strictly increasing sequence has a supremum. 

4. Let A and B be countable chains in this poset with A <* B (i.e., f <* g for 
all f ∈ A and g ∈ B). If either A has no maximum or B has no minimum, then 
there is h such that A <* h <* B. 

5. The poset H contains a chain which is an η₁ ordering. 


[Hint: Note that if f ≺ g in M then f <* g in H (but not conversely), and so any 
chain in M is a chain in H, and by Problem 802 M contains η₁ chains.] 


It is a celebrated result of Hausdorff [60] that the poset H as well as the posets 
S, P, and M, all contain (ω₁, *ω₁) gaps (i.e., they contain maximal chains with 
(ω₁, *ω₁) gaps). We will not prove the result in full generality, but it is easy to 
derive it from the Continuum Hypothesis. 


Proposition 805 (CH). All four posets above have (ω₁, *ω₁) gaps. 


Proof. Assume the CH. In the posets S, P, and M, any maximal chain is an η₁ chain 
of size ℵ₁, and so has (ω₁, *ω₁) gaps by Problem 767. 

To get an (ω₁, *ω₁) gap in H, start with a maximal chain C in M having a 
Dedekind partition L, U, where L has a cofinal subset of type ω₁ and U has a 
coinitial subset of type *ω₁. Now note that one cannot have an element f with 
L <* f <* U in H, since that would imply L ≺ f ≺ U in M. Extending C to a 
maximal chain in H thus retains the (ω₁, *ω₁) gap of C. □ 


For a proof of this result without assuming the Continuum Hypothesis, see [35] or 
[34]. 


11.4 Well-Founded Relations and Rank Functions 


Well-founded relations can be viewed as a generalization of well-orders. 


Definition 806. Let R be a relation on a set A and let B ⊆ A. Given x ∈ B, we 
say that x is an R-minimal element of B if there is no y ∈ B with yRx. We write 
min_R[B] for the set of R-minimal elements of B. 


Definition 807. We say that the relation R is well-founded on the set A if every 
nonempty subset of A has at least one R-minimal element. 
We say that (A, R) is a well-founded structure if R is a well-founded relation 
on A. 


Note that a well-founded relation must be asymmetric and hence irreflexive. 


Problem 808 (DC). A relation R on a set A is not well-founded if and only if 
A contains an “infinite sequence of R-descending elements,” that is, there is a 
sequence of elements x₁, x₂, ..., x_n, ... ∈ A such that x_{n+1} R x_n for all n ∈ N. 


Clearly, every well-order is a well-founded relation. 


Problem 809. Show that the strict divisibility relation on N, defined by xRy ⇔ x 
divides y and x ≠ y, is well-founded. 


Problem 810 (Transfinite Induction on Well-Founded Structures). Let R be a 
well-founded relation on the set A, and B ⊆ A. Suppose that for any a ∈ A, if 
{x ∈ A | xRa} ⊆ B then a ∈ B. Then B = A. 


Given a well-founded relation R on a set A, the elements of A can be classified into 
distinct ordinal ranks as follows: The R-minimal elements of A are said to have rank 
0. We then remove the elements of rank 0 from the set A to get the set A′, and the 
minimal elements of A′ are said to have rank 1. In general, using transfinite recursion 
on the ordinal α, we can define the elements of A of rank α to be the minimal 
elements of the subset obtained by removing from A all elements having rank < α. 
We can continue this process through the ordinals until the set A is exhausted. This 
procedure is readily formalized using the framework of abstract iterated derivatives 
and ranks (Theorem 714, Sect. 10.4), when we define the derivative operator ∇ by 
∇(E) := E∖min_R[E]. 
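
For a finite set all ranks are finite, and the peeling procedure just described can be carried out literally. The sketch below is an illustration under stated assumptions: the function name rank_layers is hypothetical, and strict divisibility on {1, ..., 12} is used only as a sample well-founded relation.

```python
def rank_layers(A, R):
    """Peel off the R-minimal elements repeatedly.

    A is a finite set and R(x, y) a well-founded relation on it.
    Returns the list [A_0, A_1, ...] of rank layers.
    """
    remaining, layers = set(A), []
    while remaining:
        minimal = {x for x in remaining
                   if not any(R(y, x) for y in remaining)}
        if not minimal:
            raise ValueError("R is not well-founded on A")
        layers.append(minimal)
        remaining -= minimal      # the derivative: E minus min_R[E]
    return layers

def strictly_divides(x, y):
    return x != y and y % x == 0

layers = rank_layers(range(1, 13), strictly_divides)
for alpha, layer in enumerate(layers):
    print(alpha, sorted(layer))
# 0 [1]
# 1 [2, 3, 5, 7, 11]
# 2 [4, 6, 9, 10]
# 3 [8, 12]
```

The number of layers (here 4) is the rank of the structure, and the index of the layer containing an element is the rank of that element.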


Theorem 811 (Canonical Decomposition of Well-Founded Relation). Let R be 
a well-founded relation on the set A. Then there is a unique ordinal μ and a unique 
partition ⟨A_α | α < μ⟩ of A into pairwise disjoint nonempty sets such that for every 
α < μ, A_α consists of the set of R-minimal elements of A∖⋃_{β<α} A_β, that is, 


A_α = min_R[A∖⋃_{β<α} A_β]. 


Proof. The result follows directly from Theorem 714 when we define a derivative 
operator ∇: P(A) → P(A) by 


∇(B) := B∖min_R[B], 

and define the sets A_α as 


A_α := A^(α)∖A^(α+1), 

where A^(α) denotes the α-th iterated derivative of A. In particular, A^(α+1) = ∇(A^(α)) 
and A∖A^(α) = ⋃_{β<α} A_β. Theorem 714 guarantees the existence of a unique 
least ordinal μ such that A^(μ+1) = A^(μ). Since R is well-founded on A, the 
derivative ∇ is strict, and so we have A^(μ) = ∅. It follows that ⟨A_α | α < μ⟩ = 
⟨A^(α)∖A^(α+1) | α < μ⟩ is a partition of A satisfying the condition of the theorem. 
Uniqueness of the partition follows by routine transfinite induction. □ 


Note that in the framework of Theorem 714, the set A_α above consists precisely of 
the elements of rank α. Thus rank can also be defined in terms of the sets A_α as 
follows. 


Definition 812 (Ranks on Well-Founded Relations). Let R be a well-founded 
relation on A, and let ⟨A_α | α < μ⟩ be the canonical decomposition of (A, R) as 
stated in the theorem. 


1. For each x ∈ A, the rank of the element x, denoted by ρ_R(x), is defined to be the 
unique ordinal ν such that x ∈ A_ν. 

2. The ordinal μ will be called the rank of the well-founded structure (A, R) and 
will be denoted by rank_R(A). 

3. The mapping x ↦ ρ_R(x) from A to the set W(μ) of ordinals is called the 
canonical rank function for the well-founded structure (A, R). 


Remark. With ∇ as the derivative operator defined by 
∇(B) := B∖min_R[B], 


the rank function ρ_R in the above definition is the same as the rank function ρ = ρ_∇ of 
Theorem 714 for the abstract derivative ∇. 


Problem 813. Show that if R is a well-founded relation on a set A having rank μ, 
then the canonical rank function ρ_R: A → W(μ) is surjective, and hence we have: 


μ = rank_R(A) = sup_{x∈A} (ρ_R(x) + 1). 


Problem 814. Show that for a well-founded structure (A, R), the canonical rank 
function ρ_R satisfies: 


xRy ⇒ ρ_R(x) < ρ_R(y) (for all x, y ∈ A). 


Definition 815. Say that ρ is a rank function for a relation R on a set A (or ρ is a 
rank function for (A, R)) if ρ maps A to a set of ordinals and ρ is strictly increasing, 
i.e., if for any x, y ∈ A, ρ(x) and ρ(y) are ordinals and 


xRy ⇒ ρ(x) < ρ(y). 

We say that a relation R on a set A admits a rank function if there is some rank 
function ρ for (A, R). 


The following is an important characterization of well-founded relations. 


Problem 816. A relation is well-founded if and only if it admits a rank function. 


By Theorem 811, for every well-founded relation one can effectively determine a 
rank function for it, namely the canonical rank function. The canonical rank function 
can itself be characterized as follows. 


Problem 817. Let R be a well-founded relation on a set A, and let ρ be any rank 
function for (A, R). Show that ρ equals the canonical rank function ρ_R if and only if 


for every x ∈ A: ρ(x) = sup{ρ(y) + 1 | y ∈ A, yRx}, 


where we take sup ∅ = 0. 
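
On a finite set the supremum is a maximum, so the recursive condition of Problem 817 can be evaluated directly. The sketch below is an illustration only; strict divisibility on {1, ..., 12} is an assumed sample relation, and the name rho is chosen for the example.

```python
from functools import lru_cache

A = range(1, 13)

def R(x, y):
    """Strict divisibility, used here only as a sample well-founded relation."""
    return x != y and y % x == 0

@lru_cache(maxsize=None)
def rho(x):
    """rho(x) = sup{rho(y) + 1 : y R x}, with the sup of the empty set taken as 0."""
    return max((rho(y) + 1 for y in A if R(y, x)), default=0)

print([rho(x) for x in A])  # [0, 1, 1, 2, 1, 2, 1, 3, 2, 2, 1, 3]
```

The recursion terminates because R is well-founded: every call descends to strictly R-smaller elements.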


If ρ, σ are rank functions for (A, R), we write ρ ≤ σ to denote ρ(x) ≤ σ(x) for all 
x ∈ A. The following characterizes the canonical rank function as the unique “least 
one.” 


Problem 818. Let R be a well-founded relation on a set A, and let ρ_R be the 
canonical rank function for (A, R). Show that ρ_R ≤ ρ for any rank function ρ for 
(A, R). Conversely, if ρ* is a rank function for (A, R) such that ρ* ≤ ρ for every 
rank function ρ for (A, R), then ρ* = ρ_R. 


Problem 819. Let R be a relation on A, S be a well-founded relation on B, and let 
f: A → B be strictly increasing: xRy ⇒ f(x) S f(y). Then R is well-founded on 
A, and the rank of (A, R) is at most the rank of (B, S). 


Definition 820. Let R be a relation on a set A, and x, y ∈ A. 


1. x is an R-predecessor of y if xRy. 

2. B ⊆ A is downward R-closed if v ∈ B, uRv ⇒ u ∈ B. 

3. We say that x is an R-ancestor of y, and write x R_* y, if every downward 
R-closed subset of A containing all R-predecessors of y also contains x. 

4. R^≪[y] := {x | x R_* y} denotes the set of all R-ancestors of y. 


Problem 821. Let R be a relation on a set A. Then x R_* y if and only if there exist 
n ≥ 2 and u₁, u₂, ..., u_n such that u₁ = x, u_n = y, and u_k R u_{k+1} for 1 ≤ k < n, 
and so R_* is the smallest transitive relation containing R, i.e., R_* is the transitive 
closure of R. 
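
For a finite relation given as a set of pairs, the transitive closure can be computed by adding composed pairs until nothing new appears. The following sketch is one simple way to do this, shown for illustration; the function name and the sample relation are assumptions.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given set of pairs."""
    closure = set(pairs)
    while True:
        # Compose the current relation with itself.
        new = {(x, w) for (x, y) in closure for (z, w) in closure if y == z}
        if new <= closure:
            return closure
        closure |= new

# Sample relation on {1, 2, 3, 4}: x R y if and only if y = x + 1.
R = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(R)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Each pair in the result corresponds to a finite chain of R-steps from the first component to the second, matching the description in Problem 821.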


Problem 822. Let R be a relation on a set A. Show that R_* is well-founded on A 
if and only if R is well-founded on A. 


As a result we have the following strengthening of Problem 810. 
Problem 823 (Strong Induction on Well-Founded Relations). Let R be a 
well-founded relation on the set A, and B ⊆ A. Suppose that for any a ∈ A, if every 
R-ancestor of a is in B then a ∈ B. Then B = A. 


Note that if R is well-founded on A, then for any a ∈ A, R is well-founded on 
R^≪[a], and so R^≪[a] by itself becomes a well-founded structure under the relation R, 
which we may call the well-founded substructure consisting of the R-ancestors of 
a. The following useful proposition shows how the distinct rank functions on the 
parent structure and on the substructures are related. 


Proposition 824. Let (A, R) be a well-founded structure and let a ∈ A. Then 


1. The canonical rank function ρ_{R^≪[a]} for the well-founded substructure R^≪[a] is the 
restriction of the canonical rank function ρ_R for (A, R). 
2. ρ_R(a) = rank_R(R^≪[a]), that is, the canonical rank of a in (A, R) equals the rank 
of the substructure R^≪[a]. 
3. rank_R(A) = sup_{b∈A} (rank_R(R^≪[b]) + 1). 


Proof. The first part follows from Problem 817 by transfinite induction, using the 
fact that R^≪[a] is downward R-closed. 


Since ρ_R(a) = sup{ρ_R(x) + 1 | xRa} = sup{ρ_{R^≪[a]}(x) + 1 | xRa} = rank_R(R^≪[a]), the 
second part follows. 


The last part follows from the second part. □ 


Problem 825 (Transfinite Induction for Well-Founded Structures). Let P be a 
property which satisfies the following condition: For any well-founded relation R 
on any set A, if every substructure R^≪[a] (a ∈ A) has property P, then (A, R) itself 
has property P. Then every well-founded structure has property P. 


The following problem gives an example of a well-founded relation whose 
inverse relation is also a nontrivial well-founded relation. 


Problem 826. Let X = P*(N) be the set of all finite subsets of N and define a 
relation P on X by the condition aPb if and only if a ⊊ b and either a = ∅ or 
min a > max(b∖a). Let R = P⁻¹ be the inverse relation of P. Show that 


1. aPb holds if and only if b can be partitioned as b = c ∪ a (c ∩ a = ∅), where 
c ≠ ∅ and x < y for all x ∈ c and y ∈ a. 

2. Both (X, P) and (X, R) are well-founded strict posets. 

3. In (X, P), the element ∅ is the least element (hence the unique minimal 
element), and every singleton is an immediate P-successor of ∅. 

4. If a ∈ X has n elements (|a| = n), what is the P-rank of a? 

5. In (X, P), every element has finite P-rank, and for every n ∈ N there is an 
element having P-rank n. 

6. There is no strictly P-increasing infinite sequence. 

7. R is a well-founded relation on X, in which ∅ is the R-greatest element and 
every singleton is an immediate R-predecessor of ∅. 

8. {1} is an R-minimal element in X. 

9. What are the R-predecessors of {2}? Of {3}? 

10. Draw a diagram showing the R-predecessors of {4} and how they are related 
by the relation R. 

11. What is the R-rank of {2}? Of {3}? Of {4}? 

12. What are the R-minimal elements in X? 

13. For each n ∈ N find the R-rank of {n}. 

14. What is the R-rank of ∅? 

15. What is the rank of the well-founded relation (X, R)? Of (X, P)? 


Problem 827. A positive integer is called square free if it is a product of distinct 
primes. We regard 1 as square free. If a ∈ N is square free with 


a = q₁q₂⋯q_n where q₁ < q₂ < ⋯ < q_n are increasing primes, 


then we say that q_k is the k-th prime factor of a (k = 1, 2, ..., n). Let A be the 
set of those square free positive integers a such that, with q as the smallest prime 
factor of a, the total number of prime factors of a either does not exceed q or does 
not exceed the value of the q-th prime factor of a. In particular, 1 ∈ A. Define a 
relation R on A by the condition 


aRb ⇔ a, b ∈ A and b is a proper divisor of a. 


1. Show that (A, R) is a well-founded structure, that is, the relation R is 
well-founded on A. 

2. Find the ranks of the elements 6, 10, and 21. 

3. Characterize the R-minimal elements of A. 

4. Find the rank of the element 1 and of the structure (A, R). 


11.5 Trees 


Definition 828. A poset (T, ≺) is called a tree if either T = ∅, or T has a least 
element root(T) (called the root of T) and the set of predecessors of any element is 
well-ordered. 

The elements of a tree will often be referred to as nodes. 


If (T, ≺) is a tree, we will often refer to the underlying set T as the tree so long as 
the relation ≺ can be understood from context. 

Evidently, every well-order is a tree, and every tree is well-founded. The poset 
(X, P) from Problem 826 is a tree. 


Definition 829. Let T be a tree under the relation ≺. 


1. We say that T′ ⊆ T is a subtree of T if T′ is downward closed, i.e., if x ∈ T′ and 
y ≺ x imply y ∈ T′. 

2. The height of an element x ∈ T, denoted by ht_T(x) or ht(x), is the order type 
(ordinal) of the set {y ∈ T | y ≺ x} of predecessors of x. 

3. For any ordinal α, the α-th level of T, denoted by Lev_α(T), is defined as the set 
of all elements of T with height α. (So x ∈ Lev_α(T) ⇔ ht(x) = α.) 

4. A node v ∈ T is a child of a node u ∈ T if v is an immediate successor of u, i.e., 
if u ≺ v and ht(v) = ht(u) + 1. 

5. B is a branch of T if B is a chain and is downward closed, i.e., if B is a linearly 
ordered subtree of T. 


[Figure: A Tree T Drawn Growing Upward, with the levels Lev₀(T), Lev₁(T), Lev₂(T), and Lev₃(T) marked from bottom to top.] 


The following facts are immediate. 


Problem 830. Any subtree of a tree is a tree. If a tree has an element of height α, 
then it has elements of every height β < α. The levels of a tree are pairwise disjoint, 
and so form a partition of the tree. 


Problem 831. Show that for every tree T there is an ordinal α such that no element 
of T has height α. 


[Hint: If η is the Hartogs ordinal for P(T), then the levels Lev_α(T), α < η, are 
pairwise disjoint sets so all of them cannot be nonempty.] 


Definition 832. The height of a tree T, denoted by ht(T), is the least ordinal α such 
that no element of T has height α. 


Definition 833. A tree T is said to be finitely branching if every node has at most 
finitely many children (immediate successors), i.e., if for any x ∈ T, the set {y ∈ 
T | x ≺ y and ht(y) = ht(x) + 1} is finite. 

Problem 834. The poset (X, P) from Problem 826 is a tree. What is its height? Is it 
finitely branching? 


Problem 835. Let (T, ≺) be a tree. Regarding it as a well-founded structure, show 
that the height of an element x, ht(x), is the same as ρ_≺(x), the canonical ≺-rank of 
x, and that the height of the tree, ht(T), equals rank_≺(T), the rank of the structure 
(T, ≺). 


Problem 836. A tree in which every level is finite must be finitely branching. Is the 
converse true? 


The Tree A* of Strings over a Set A 


The following example gives an especially important type of tree that will 
concern us. 


Example 837. Let A be a nonempty set and let A* denote the set of all strings 
(finite sequences) consisting of elements of A. Then A* is a tree under the relation 
⊂, where u ⊂ v stands for “u is a (proper) initial prefix of v.” 

Two important special cases of this example are obtained by taking A = {0, 1}, 
giving us the full binary tree {0, 1}* where every node has exactly two immediate 
successors, and by taking A = N which gives a tree N* in which every node has 
infinitely many immediate successors. 
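
A finite part of the tree A* is easy to model with tuples as strings. The sketch below is an illustration only: it lists the binary strings of length at most 3, and the function names are assumptions made for the example.

```python
from itertools import product

def strings_up_to(A, n):
    """All strings over A of length at most n, as tuples (a finite part of A*)."""
    return [s for k in range(n + 1) for s in product(A, repeat=k)]

def is_proper_prefix(u, v):
    return len(u) < len(v) and v[:len(u)] == u

def children(T, u):
    """Immediate successors of u in T: the one-symbol extensions of u."""
    return [v for v in T if len(v) == len(u) + 1 and is_proper_prefix(u, v)]

T = strings_up_to([0, 1], 3)
root = ()                                   # the empty string
print(len(T))                               # 15 = 1 + 2 + 4 + 8
print(children(T, (0, 1)))                  # [(0, 1, 0), (0, 1, 1)]
print(all(is_proper_prefix(root, u) for u in T if u != root))  # True
```

In this model the height of a node is its length as a tuple, and the nodes of a given length form one level of the tree.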


Problem 838. Consider the tree A* of all finite strings over A (A ≠ ∅). Let κ = 
|A| be the cardinality of A. Show that 


1. A* is a tree under the relation ⊂ with root ε (the empty string) and height ω. 

2. The height of any element is its length (as a string). 

3. Every node in A* has κ-many immediate successors, so A* is finitely branching 
if and only if A is finite. 

4. |Lev_n(A*)| = κ^n for any finite n = 0, 1, 2, ..., so if A is finite then every level 
of A* is finite. 

5. A branch in the tree A* is infinite if and only if it is maximal. 


Definition 839 (Trees over A). We say that T is a tree over A if T is a subtree of 
the tree A* of strings from A (under the string prefix relation ⊂). 


We now obtain a “representation theorem” for trees of height at most w. 


Problem 840. Let (T, <) be a tree of height at most w. Then T is isomorphic to a 
tree over A for some A. That is, there is a set A, a subtree T' C A*, and a bijection 
f:T > T’ such that for allx,y € T, x ~ y & f(x) C f(y) (x precedes y in T 
if and only if f(x) is an initial prefix of f(y)). 

Moreover, if T is also countable, then one can take A to be the set N. 


Thus every countable tree of height ≤ ω is isomorphic to some tree over N.


Problem 841. Let (N, |) denote the poset of natural numbers under the (strict)
divisibility relation |. Show that every countable tree of height ≤ ω can be embedded
(as a poset) into (N, |).


[Hint: By the previous problem, it suffices to embed N* into (N, |).]
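
One possible embedding (an assumption of this sketch, not necessarily the construction the author has in mind) assigns a distinct prime to every nonempty string and sends a string to the product of the primes of its nonempty prefixes. The helper names `primes`, `prime_of`, and `embed` are hypothetical, and the prime assignment is made lazily so that only finitely many strings are ever handled.

```python
from itertools import count

def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small demo)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

_prime_source = primes()
_prime_of = {}  # hypothetical bookkeeping: one distinct prime per nonempty string

def prime_of(u):
    """Return the prime assigned to the nonempty string u, assigning one if needed."""
    if u not in _prime_of:
        _prime_of[u] = next(_prime_source)
    return _prime_of[u]

def embed(u):
    """Map u to the product of the primes of its nonempty prefixes (empty string -> 1)."""
    value = 1
    for i in range(1, len(u) + 1):
        value *= prime_of(u[:i])
    return value

def strictly_divides(m, n):
    """Strict divisibility: m divides n and m != n."""
    return m != n and n % m == 0

def is_proper_prefix(u, v):
    return len(u) < len(v) and v[:len(u)] == u
```

With this choice, u is a proper prefix of v exactly when embed(u) strictly divides embed(v), since the prime assigned to u divides embed(v) only if u is a prefix of v.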


Although we are primarily interested in countable trees of height ω, the
representation theorem can be generalized to arbitrary trees as well.


Problem 842. Given a set A and an ordinal α, let A^{<α} denote the set of all
functions whose domain is a proper initial segment of W(α) and whose range is
contained in A. For u, v ∈ A^{<α}, let u ⊂ v if and only if v extends u, that is, if there
exist ordinals β < γ such that dom(u) = W(β), dom(v) = W(γ), and for all ξ < β
we have u(ξ) = v(ξ). Show that


1. For any set A and any ordinal α, (A^{<α}, ⊂) is a tree of height α.
2. Every tree is isomorphic to a subtree of (A^{<α}, ⊂), for some set A and some
ordinal α.


Remark. With the notation of the last problem, the tree A* of all finite sequences
from A can be denoted by A^{<ω}.


11.6 König’s Lemma and Well-Founded Trees


König’s Infinity Lemma


Problem 843. Show that if T is a tree of height ≤ ω, then T is finitely branching if
and only if every level of T is finite.
Give an example of a finitely-branching tree with some infinite levels.


Theorem 844 (The König Infinity Lemma). Let T be a tree of height ω in which
every level is finite. Then T has an infinite branch.


The result is often expressed by saying “every finitely branching infinite tree has an
infinite branch.”


Proof. Let (T, ≺) be a tree of height ω in which every level is finite.

Since T has height ω, it has elements of height n for every n < ω, and so T is
infinite. For each x ∈ T, let Succ(x) := {y | x ≺ y or x = y}. Note that for any
x ∈ T and any n < ω, we have


x ∈ Lev_n(T) ⇒ Succ(x) = {x} ∪ ⋃ {Succ(y) | x ≺ y, y ∈ Lev_{n+1}(T)}.


Note that since every level of T is finite, the big union above is actually a finite
union.

Let x₀ be the least element of T. Then Succ(x₀) = T, so Succ(x₀) is infinite,
with


Succ(x₀) = {x₀} ∪ ⋃ {Succ(y) | x₀ ≺ y, y ∈ Lev₁(T)}.


Since the big union above is finite while the left side is infinite, there must be at
least one x₁ ∈ Lev₁(T) such that Succ(x₁) is infinite. Fix such an x₁ (with Succ(x₁)
infinite). Then


Succ(x₁) = {x₁} ∪ ⋃ {Succ(y) | x₁ ≺ y, y ∈ Lev₂(T)}.


Again, since the big union above is finite but the left side is infinite, we can fix
x₂ ∈ Lev₂(T) such that Succ(x₂) is infinite. Continuing in this fashion, we get a
sequence


x₀ ≺ x₁ ≺ x₂ ≺ ··· ≺ x_n ≺ x_{n+1} ≺ ··· (x_n ∈ Lev_n(T)).


Then B := {x_n | n = 0, 1, 2, ...} is an infinite branch through T. □


Well-Founded Trees 


Since every tree is well-founded, the term “well-founded tree” seems redundant. 
However, it has the following special meaning in the context of trees. 


Definition 845. A well-founded tree is a tree (T, P) such that the inverse relation
R = P⁻¹ is well-founded on T.


We saw an example of a well-founded tree in Problem 826. 


Problem 846 (DC). Show that a tree is well-founded if and only if it has no infinite 
branch. 


Definition 847 (Ranks in Well-Founded Trees). Let (T, P) be a well-founded
tree, so that the inverse relation R = P⁻¹ is well-founded on T, and let ρ_R be
the canonical rank function for the well-founded structure (T, R).

The rank of an element x ∈ T is defined as ρ_R(x).

The rank of the tree T, denoted by rank(T), is defined as ρ_R(r) where r =
root(T) is the P-least element of T. We put rank(T) = 0 if T = ∅.


Note on terminology. If (T, P) is a well-founded tree with the inverse relation R =
P⁻¹ well-founded on T, then the term “height” applies to the relation P and the
term “rank” applies to the relation R. E.g., the “rank of an element x” is the R-rank
of x in the well-founded structure (T, R), while the “height of x” is the P-height of
x in the tree (T, P).


Problem 848. A well-founded tree has height at most ω.


By Problems 840 and 848, every well-founded tree is isomorphic to a tree over A
for some set A, so the study of well-founded trees can be limited to trees over (some
set) A, i.e., to subtrees of A*.


Problem 849. Define a subtree T of N* by 


T := {u ∈ N* | u = ε or u = (u₁, u₂, ..., u_n), n ∈ N, and u₁ ≥ n}.


1. Show that (T, ⊂) is a well-founded tree of rank ω + 1.
2. What is the rank of ε? What is the rank of the element (7, 9, 2)?


Problem 850. Give an example of a well-founded tree over N which has rank ω²
but height ω.


Problem 851. A nonempty well-founded tree T has finite rank if and only if it has 
finite height, and in this case ht(T) = rank(T) + 1. 


Problem 852. If A is a finite set then any well-founded tree over A must be finite
and so must have rank < ω.


Problem 853. If T ⊆ A* is a well-founded tree over a set A, then rank(T) equals
the rank of the well-founded structure (T ∖ {ε}, ⊃).


[Figure: T_ω, a well-founded tree of rank ω; T_{ω+ω}, a well-founded tree of rank ω + ω.]


Problem 854 (Truncated Ranks). For a tree T over N and u ∈ N*, put
T^(u) := {v ∈ N* | u ∗ v ∈ T} (i.e., T^(u) is T truncated at u).


Then:


1. T^(u) is a tree, and (T^(u))^(v) = T^(u∗v). If T is well-founded then so is T^(u) with
rank(T^(u)) ≤ rank(T).

2. The rank function on well-founded trees satisfies, and is the unique function
satisfying, the recursion equation

rank(T) = sup{rank(T^(n)) + 1 | (n) ∈ T, n ∈ N}, with sup(∅) := 0.


3. If T ≠ {ε}, then T is well-founded of rank α if and only if T^(n) is well-founded
of rank < α for all n ∈ N and for each ξ < α there is v ∈ N* with len(v) > 0
such that T^(v) has rank ξ.
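
For finite trees the recursion equation can be evaluated directly. The following is a minimal sketch, assuming a tree over N is given as a finite, prefix-closed set of Python tuples; the names `truncate`, `rank`, and `prepend` are illustrative, and only finite ranks can arise in this setting.

```python
def truncate(tree, u):
    """T^(u) = {v | u*v in T}: the tree T truncated at the string u."""
    k = len(u)
    return {v[k:] for v in tree if v[:k] == u}

def rank(tree):
    """rank(T) = sup{rank(T^(n)) + 1 | (n) in T}, with the sup of the empty set = 0."""
    children = {v[0] for v in tree if len(v) == 1}
    return max((rank(truncate(tree, (n,))) + 1 for n in children), default=0)

def prepend(a, tree):
    """The tree {ε} ∪ {(a)*u | u in T}, obtained by placing T above the node (a)."""
    return {()} | {(a,) + u for u in tree}

T = {(), (0,), (1,), (1, 5), (1, 5, 2)}
print(rank(T))              # longest branch has three non-root nodes
print(rank(prepend(7, T)))  # one more than rank(T)
```

The recursion terminates because each truncation at a one-element string removes a level from a finite tree.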


Existence of Well-Founded Trees of Every Rank. Our remaining task is to show
that for every countable ordinal α < ω₁, there is a countable well-founded tree (over
N) having rank α. (Using the full Axiom of Choice, one can also show that for every
ordinal α there is a well-founded tree of rank α.)


Definition 855. If T ⊆ A* is a tree over A and a ∈ A, define:


a ∗ T := {ε} ∪ {(a) ∗ u | u ∈ T}


= {ε} ∪ {(a, u₁, u₂, ..., u_n) | (u₁, u₂, ..., u_n) ∈ T, n ≥ 0}


Problem 856. Show that


1. If T ⊆ N* is a nonempty well-founded tree over N and n ∈ N, then n ∗ T is
a well-founded tree over N with


rank(n ∗ T) = rank(T) + 1.


In particular, for every well-founded tree T ⊆ N* over N one can effectively find
a well-founded tree T′ ⊆ N* such that rank(T′) > rank(T).
2. If (T_n | n ∈ N) is a sequence of well-founded trees over N, then


T := ⋃_{n∈N} n ∗ T_n


is a well-founded tree over N, and if T_n ≠ ∅ for some n, then


rank(T) = sup_{n∈N} (rank(T_n) + 1).


Notice that every countable well-founded tree must have rank < ω₁. Using the last
problem (and use of Choice), one obtains the following converse result.


Problem 857 (CAC). Show that for every ordinal α < ω₁, there is a well-founded
tree over N having rank α.


Finally, under the full Axiom of Choice one gets the existence of well-founded trees 
of every possible rank. 


Problem 858 (AC). Show that for every ordinal α, there is a well-founded tree of
rank α.


11.7 Ramsey’s Theorem


A popular puzzle says that in any group of six or more people there are three people 
who are mutual acquaintances or mutual strangers. Ramsey’s Theorem, which we 
prove below, says that in any infinite group of people there are infinitely many 
people who are mutual strangers or mutual acquaintances. 


Definition 859. For any set X and n ∈ N, we let [X]^n denote the family of
n-element subsets of X:


[X]^n := {E | E ⊆ X and |E| = n}.


If [X]^n is partitioned into k sets as [X]^n = ⋃_{i=1}^{k} X_i, then a subset H ⊆ X is said
to be homogeneous for the partition if [H]^n ⊆ X_i for some i = 1, 2, ..., k.

Similarly, if f: [X]^n → {1, 2, ..., k} then H ⊆ X is said to be homogeneous for
f if f is constant on [H]^n.


The puzzle above can now be stated as follows: If |X| ≥ 6 and if [X]² is partitioned
as [X]² = X₁ ∪ X₂ (with X₁ ∩ X₂ = ∅) then there is H ⊆ X with |H| = 3 and
[H]² ⊆ X_i for some i ∈ {1, 2} (i.e., H is homogeneous).

We could state the result equivalently using functions: If |X| ≥ 6 and f: [X]² →
{1, 2}, then there is H ⊆ X such that |H| = 3 and f is constant on [H]², i.e.,
f({x, y}) = f({u, v}) for all x, y, u, v ∈ H, with x ≠ y and u ≠ v.
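
The puzzle is small enough to check by exhaustive search. The following is a minimal sketch, assuming X = {0, ..., n−1} and representing f as a dictionary on 2-element frozensets; the function names are illustrative. It confirms the statement for |X| = 6 and shows that 6 cannot be lowered to 5.

```python
from itertools import combinations, product

def has_homogeneous_triple(n, colouring):
    """colouring maps each 2-element frozenset of range(n) to 1 or 2."""
    return any(
        colouring[frozenset((x, y))]
        == colouring[frozenset((x, z))]
        == colouring[frozenset((y, z))]
        for x, y, z in combinations(range(n), 3)
    )

def every_colouring_has_triple(n):
    """Check all 2-colourings of the pairs of an n-element set."""
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    for colours in product((1, 2), repeat=len(pairs)):
        if not has_homogeneous_triple(n, dict(zip(pairs, colours))):
            return False
    return True

print(every_colouring_has_triple(6))
print(every_colouring_has_triple(5))
```

For n = 6 there are 2^15 colourings to inspect, so the search finishes quickly; the same brute-force approach becomes infeasible for larger homogeneous sets.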


Theorem 860 (Ramsey’s Theorem). If X is an infinite set and f: [X]² → {1, 2},
then some infinite H ⊆ X is homogeneous for f.


Proof. We will use König’s Infinity Lemma to prove the theorem.

Without loss of generality we assume X = N, so that f: [N]² → {1, 2}. Also for
each 2-element set {m, n} with m < n, we will write f(m, n) for f({m, n}). For
each nonempty E ⊆ N, we put


E^(1) := {n ∈ E | n > min(E) and f(min(E), n) = 1}, and
E^(2) := {n ∈ E | n > min(E) and f(min(E), n) = 2}.


Then E^(1) and E^(2) are disjoint, and E = {min(E)} ∪ E^(1) ∪ E^(2), so E^(1) ∪ E^(2)
contains all but one member of E.

We now define a finitely branching tree T of height at most ω consisting of
nonempty subsets of N in which every node has at most two children, so that
|Lev_n(T)| ≤ 2^n for all n = 0, 1, 2, ....

Let N be the root node of T, and for each node E ∈ T, we take each of E^(1) and
E^(2), provided that it is nonempty, to be a child of E. Since the union of the children
of a node contains all but one member of the node and since T is finitely branching,
it follows by induction that the union of the nodes of level n contains all but finitely
many natural numbers. Hence Lev_n(T) ≠ ∅ for all n = 0, 1, 2, ..., and so T has
height ω. By König’s Infinity Lemma, T has an infinite branch, say


E₀ ⊋ E₁ ⊋ ··· ⊋ E_n ⊋ E_{n+1} ⊋ ···

where E₀ = N, and E_{n+1} equals E_n^(1) or E_n^(2) for all n. Hence there is b ∈ {1, 2}
such that E_{n+1} = E_n^(b) for infinitely many n, and so there are natural numbers
n₁ < n₂ < ··· < n_k < ··· with E_{n_k+1} = E_{n_k}^(b) for all k. Put a_k := min(E_{n_k}). Then
for k < m we have a_m ∈ E_{n_m} ⊆ E_{n_k+1} = E_{n_k}^(b), hence f(a_k, a_m) = b. Therefore
the set H = {a₁, a₂, ...} is homogeneous for f. □


Definition 861 (Arrow Notation). If κ, μ are cardinals and n, k ∈ N, we write

κ → (μ)^n_k

to denote the statement “For any set X with |X| = κ and any function f: [X]^n →
{1, 2, ..., k}, there is a homogeneous H ⊆ X with |H| = μ.”


Thus, Ramsey’s Theorem says that ℵ₀ → (ℵ₀)^2_2. This is a special case of the
following more general result.


Theorem 862 (General Ramsey Theorem). For all n, k ∈ N, we have:
ℵ₀ → (ℵ₀)^n_k.


We will not prove this result, but the reader may try it as a challenging exercise (it
can be proved using induction on n).


A sequence (x_n) in an order or a partial order is monotone increasing if x_m ≤ x_n
for all m < n, and it is strictly increasing if x_m < x_n for all m < n. Similar
definitions are given for decreasing sequences. A sequence is monotone if it is either
monotone increasing or monotone decreasing.


Problem 863. Use the General Ramsey Theorem to show that every infinite
sequence in a linear order has a subsequence which is either strictly decreasing
or strictly increasing or constant. Conclude that every infinite sequence in a linear
order has a monotone subsequence.


[Hint: For a sequence (x_n) in an order define f: [N]² → {1, 2, 3} by setting, for
m < n, f(m, n) := 1 if x_m < x_n, := 2 if x_m > x_n, and := 3 if x_m = x_n.]


Problem 864. Every infinite sequence in a partial order has a subsequence which 
is either monotone or consists of pairwise incomparable elements. 


Problem 865. Show that 2^ℵ₀ ↛ (ℵ₁)^2_2. Hence ℵ₁ ↛ (ℵ₁)^2_2.


[Hint: If 2^ℵ₀ → (ℵ₁)^2_2, argue as in Problem 863 to show that every linear order of
size 2^ℵ₀ has a suborder of type ω₁ or of type *ω₁, a contradiction.]


Problem 866. Show that 2^ℵ_α ↛ (ℵ_{α+1})^2_2.
[Hint: Use Problem 796.]


Remark. The General Ramsey Theorem, for each n ≥ 3, is closely related to
König’s Infinity Lemma. In a sense, each can be “easily derived” from the other
without using any other “strong” theorems. This vague statement is made precise
in an area of mathematical logic known as reverse mathematics, where strengths of
mathematical statements are studied relative to weaker base subsystems. See [75]
for more details.


Chapter 12 
Postscript I: Infinitary Combinatorics 


Abstract The topics of the last chapter (Chap. 11) naturally lead to the area of 
Infinitary Combinatorics, which is beyond the scope of this text. This postscript to 
Part II is intended to be a link for the reader to begin further study in the area. We 
indicate how the obvious generalizations of three separate topics of the last chapter, 
namely short orders, König’s Infinity Lemma, and Ramsey’s Theorem, converge
naturally to the notion of a weakly compact cardinal, an example of a large cardinal.
In addition, it is shown how Suslin’s Problem is equivalent to the existence of
Suslin trees. Finally, we briefly mention Martin’s Axiom and Jensen’s Diamond
principle ◇, and their implications for the Suslin Hypothesis.


Note: Throughout this postscript we will assume the Axiom of Choice without 
explicitly mentioning it. 


12.1 Weakly Compact Cardinals 


An interesting property of (linear) orders is that any sequence in an order has a
monotone subsequence. This can be proved in various ways, e.g., Problem 863
derived it from Ramsey’s Theorem. The property can be stated equivalently as: Any
order of size ℵ₀ has a suborder of order type ω or *ω.

On the other hand R is a short linear order (i.e., R has no subset of type ω₁ or
*ω₁) and ℵ₁ ≤ |R|, so we have the contrasting fact: There are orders of size ℵ₁
which do not have any suborder of order type ω₁ or *ω₁.


Definition 867. We will say that an infinite cardinal κ = ℵ_α has the monotone
order property¹ if every linear order of cardinality ℵ_α has a suborder of type ω_α or
*ω_α.


¹This terminology is not a standard one.


Thus ℵ₀ has the monotone order property but ℵ₁ does not. More generally:


Proposition 868. An infinite cardinal with the monotone order property is a strong
limit. Hence, no successor cardinal has the monotone order property.


Proof. Suppose that κ = ℵ_α has the monotone order property. If we had 2^ℵ_β ≥ κ =
ℵ_α for some β < α, then the lexicographic power {0, 1}^{W(ω_β)} would have a suborder
of type ω_α or *ω_α and hence also a suborder of type ω_{β+1} or *ω_{β+1} (as β + 1 ≤ α),
contradicting Problem 796. □


We now have: 


Theorem 869. Any uncountable cardinal κ = ℵ_α having the monotone order
property is regular, and therefore strongly inaccessible.


Proof. Otherwise, we would get ℵ_α = Σ_{ξ<β} ℵ_{α_ξ} where β < ω_α and α_ξ < α for
all ξ < β. For each ξ < β fix an order X_ξ of order type *ω_{α_ξ}, and let X :=
⋃_{ξ<β} ({ξ} × X_ξ) be equipped with the lexicographic order. Then X does not contain
any subset of order type ω_α or *ω_α, but |X| = ℵ_α. □


A similar property of cardinals can be generalized from Ramsey’s Theorem. We saw
that ℵ₀ → (ℵ₀)^2_2 (Ramsey’s Theorem), while ℵ₁ ↛ (ℵ₁)^2_2 (Problem 865). As in the
case for the monotone order property, this immediately implies that if κ → (κ)^2_2
then κ must be a strong limit:


Proposition 870. Let κ be an infinite cardinal satisfying κ → (κ)^2_2. Then κ is a
strong limit. In particular, κ cannot be a successor cardinal.


Proof. Essentially the same as that for Proposition 868, but use Problem 866. □


Theorem 871. If κ is uncountable and κ → (κ)^2_2, then κ is regular and therefore
strongly inaccessible.


Proof. If κ were singular, we would get a partition of the form X = ⋃_{i∈I} X_i where
|X| = κ, |I| < κ, and |X_i| < κ for all i ∈ I. Define f: [X]² → {1, 2} by setting
f({x, y}) := 1 if x, y ∈ X_i for some i ∈ I, and := 2 otherwise. A homogeneous
set for the partition would then yield a contradiction. □


The third and last property of cardinals that we will consider is generalized from
König’s Infinity Lemma.


Definition 872. A cardinal κ = ℵ_α is said to have the tree property if any tree T
of height ω_α in which each level has cardinality < κ has a branch of height ω_α
(a branch of height ω_α can be equivalently described as a chain C ⊆ T such that
Lev_ξ(T) ∩ C ≠ ∅ for all ξ < ω_α).


König’s Infinity Lemma is the assertion that ℵ₀ has the tree property. A result of
Aronszajn says that ℵ₁ does not have the tree property. This means there is a tree
T of height ω₁ in which all levels are countable yet T has no branch of height ω₁.
Such trees are known as Aronszajn trees.

However, we cannot prove that a cardinal having the tree property must be
inaccessible. Still, the following important result shows how closely the three
properties just considered are related.


Theorem 873. For any uncountable cardinal κ, the following are equivalent:


1. κ has the monotone order property.
2. κ → (κ)^2_2.
3. κ is strongly inaccessible and has the tree property.


For a proof, see [14] or [41].


Problem 874. Prove that 2 implies 1 in the above theorem.


Cardinals which satisfy any (and so all) of the conditions of Theorem 873 can also 
be characterized by a metamathematical property of infinitary languages known as 
“weak compactness,” and so are called weakly compact cardinals. 


Definition 875 (Weakly Compact Cardinals). A cardinal is said to be weakly 
compact if it satisfies any of the conditions of Theorem 873. 


A lot more can be said about weakly compact cardinals. For example, they are
not only strongly inaccessible, but also preceded by an equal number of strong
inaccessibles. In fact, if κ is weakly compact, then there are arbitrarily large
cardinals ℵ_α < κ such that ℵ_α is the ω_α-th strongly inaccessible cardinal. For more
on weakly compact cardinals, see [14, 37].

The notion of a weakly compact cardinal is thus a good example of a large 
cardinal. As mentioned before, such cardinals, being at least inaccessible, cannot 
be shown to exist using the standard axioms of set theory (assuming these axioms 
are consistent) due to a result known as Gödel’s second incompleteness theorem.
Asserting the existence of a large cardinal is therefore known as a large cardinal 
hypothesis or an axiom of strong infinity. As we will see later, certain large cardinal 
hypotheses, such as that of the existence of a measurable cardinal, can have 
significant implications for ordinary mathematics. 


12.2 Suslin’s Problem, Martin’s Axiom, and ◇


Recall Suslin’s Problem, which asks: 


The Suslin Problem. Is every CCC continuum without endpoints necessarily order 
isomorphic to R? 


The mechanism of trees can be used to put Suslin’s Problem in a more useful 
combinatorial form. Note that a negative answer to Suslin’s Problem amounts to the 
existence of a linear continuum which is CCC but not separable. Such a continuum 
is called a Suslin line. 
Recall that an Aronszajn tree is a tree of height ω₁ in which all levels and all
chains are countable. Aronszajn proved that such trees exist. We now consider a
stronger property: A tree is called a Suslin tree if it is an Aronszajn tree in which
there are no uncountable antichains.


Definition 876 (Suslin Lines and Trees). A Suslin line is a linear continuum
which is CCC but not separable. A Suslin tree is a tree of height ω₁ in which all
chains and all antichains are countable.


The following result reduces Suslin’s Problem to a question about trees. 
Theorem 877. There is a Suslin line if and only if there is a Suslin tree. 


To outline a proof of Theorem 877, we need some definitions and lemmas. If u is a
node in a tree T, we will use the notation Succ^T_α(u) to denote the extensions of u of
height α, i.e., Succ^T_α(u) := {v ∈ Lev_α(T) | u ≺ v}.


Definition 878. A Suslin tree T is normal if every node has extensions in all higher
levels below ω₁, i.e., Succ^T_α(u) ≠ ∅ for every α with ht(u) < α < ω₁.


Lemma 879. Let T be a Suslin tree and let
T⁺ := {u ∈ T | Succ^T_α(u) ≠ ∅ for every α with ht(u) < α < ω₁}.


Then T⁺ is a normal Suslin tree.


Proof. T⁺ is a nonempty subtree of T, and it suffices to show that for each u ∈ T⁺,
T⁺ ∩ Succ^T_α(u) ≠ ∅ for all α with ht(u) < α < ω₁.

Suppose T⁺ ∩ Succ^T_α(u) = ∅ with u ∈ T⁺ and ht(u) < α < ω₁. Then for
each v ∈ Succ^T_α(u), there is α_v with α < α_v < ω₁, such that Succ^T_{α_v}(v) = ∅. As
Succ^T_α(u) is countable, we may fix β < ω₁ with β > α_v for all v ∈ Succ^T_α(u). But
then Succ^T_β(u) = ⋃ {Succ^T_β(v) | v ∈ Succ^T_α(u)} = ∅, a contradiction. □


For u, v ∈ T, we write u ~ v ⇔ u and v have the same set of predecessors. Then ~
is an equivalence relation on T such that each equivalence class [u]~ is contained in
some single level of T.


Lemma 880. If there is a Suslin tree then there is a normal Suslin tree in which
[u]~ is infinite for all u with ht(u) > 0.


Proof. Let T be a Suslin tree, which we assume to be normal by the last lemma.
Note that if ht(u) < α < β < ω₁ then 1 ≤ |Succ^T_α(u)| ≤ |Succ^T_β(u)|. Also, if
u ∈ T then {v ∈ T | u ≺ v} cannot be a chain, and in fact |Succ^T_α(u)| ≥ 2 for
some α > ht(u). Repeating the process infinitely many times, we see that there is
β > ht(u) such that Succ^T_β(u) is infinite. More generally, for any countable subset
C ⊆ T, we can get β such that Succ^T_β(u) is infinite for all u ∈ C. Now define
increasing ordinals (α(ξ) | ξ < ω₁) as follows: Let α(0) = 0, and for ξ > 0 let
δ := sup{α(η) | η < ξ} and put α(ξ) := the least β > δ such that Succ^T_β(u) is
infinite for all u ∈ Lev_δ(T). Finally, T′ := ⋃_{ξ<ω₁} Lev_{α(ξ)}(T) with the inherited
order is a tree with the required property. □


Proof (of Theorem 877, Outline). Suppose first that there is a Suslin line, i.e., a
linear continuum L without endpoints which is CCC but not separable. We define an
ω₁-sequence of nonempty open intervals ((a_α, b_α) | α < ω₁) (with a_α < b_α) using
transfinite recursion: For each α < ω₁, the countable set E_α := {a_β, b_β | β < α} is
not dense, so there are c < d with (c, d) ∩ E_α = ∅, and therefore by density, we
can choose and fix a_α, b_α such that c < a_α < b_α < d. Let T := {L} ∪ {(a_α, b_α) |
α < ω₁}, and order T by reverse inclusion. Then T is a Suslin tree.

Conversely, suppose that T is a Suslin tree. By Lemma 880, we may assume that
T is a normal Suslin tree in which [u]~ is infinite for all u with ht(u) > 0. For each
equivalence class [u]~ = [u], fix an order <_[u] on [u] of order type η. Now define a
linear order <_* on all of T by setting x <_* y if and only if either x ≺ y in T, or
there exist u ≼ x and v ≼ y with u ~ v and u <_[u] v. Finally, take the Dedekind
completion of the order <_* to get a Suslin line. □


Martin’s Axiom 


Recall that the affirmative answer to Suslin’s Problem is called the Suslin Hypothesis 
or SH. Equivalently, SH is the statement that there is no Suslin line. SH cannot be 
settled one way or the other using the standard axioms of set theory. The Continuum 
Hypothesis (CH) cannot decide SH either. However, an important combinatorial 
principle called Martin’s Axiom (MA) implies that if CH fails then SH must be true, 
i.e., MA + not-CH => SH. 


Definition 881. Let (P, ≼) be a poset (partial order).


1. (P, ≼) is said to satisfy the countable chain condition (CCC) if every antichain
in P is countable.

2. A subset D ⊆ P is called dense if for all u ∈ P there is v ∈ D with v ≼ u.

3. A nonempty subset G ⊆ P is called a filter if u ∈ G and u ≼ v ⇒ v ∈ G (G is
upward closed), and for all u, v ∈ G there is w ∈ G with w ≼ u and w ≼ v (G is
downward directed).


Martin’s Axiom (MA). Martin’s Axiom (MA) says: “If (P, ≼) is a CCC partial
order and 𝒟 is a family of dense subsets of P with |𝒟| < 2^ℵ₀ then there is a filter
G such that G ∩ D ≠ ∅ for all D ∈ 𝒟.”


Martin’s Axiom is an immediate consequence of CH, and so is of interest only when
CH fails. (If CH fails, then MA implies that cardinals strictly between ℵ₀ and 2^ℵ₀
have many of the properties of ℵ₀.)


Proposition 882. CH ⇒ MA.

[Hint: Let (P, ≼) be a CCC partial order and let 𝒟 be a family of dense subsets of
P with |𝒟| < 2^ℵ₀. By CH, 𝒟 is countable and can be enumerated, say as 𝒟 =
{D₁, D₂, ...}. Fix d₁ ∈ D₁, and use density to inductively pick d_{n+1} ∈ D_{n+1} such
that d_{n+1} ≼ d_n. Then put G = {u | d_n ≼ u for some n}.]


Theorem 883. MA + not-CH ⇒ SH.


Proof. Assume MA + not-CH. To get a contradiction, suppose that there is a Suslin
tree T. By Lemma 879 we may assume that T is a normal Suslin tree.

Consider T as a poset with the reverse order of T, with the root as the largest
element. Since T is a Suslin tree, T has CCC. Now for each α < ω₁ put D_α :=
⋃_{β>α} Lev_β(T). Since T is normal, each D_α is dense. Since we are assuming that
CH is false, |{D_α | α < ω₁}| < 2^ℵ₀. Hence by MA, there is a filter G with G ∩ D_α ≠
∅ for all α < ω₁. But then G must be a branch of height ω₁ through T, contradicting
that T is a Suslin tree. □


The condition MA + not-CH has been shown to be consistent with the standard 
axioms of set theory. Therefore, by the above theorem, we can consistently assume 
that there are no Suslin lines. 


Jensen’s Diamond Principle ◇


Another important combinatorial principle is the Diamond Principle due to Jensen,
which is denoted by the symbol ◇.


The Diamond Principle. ◇ says: “There are sets A_α, α < ω₁, such that for all
A, C ⊆ W(ω₁), if C is a club set then A ∩ W(α) = A_α for some α ∈ C.”


CH is an immediate consequence of ◇. Also, ◇ has been shown to be consistent with
the standard axioms of set theory (since it follows from the axiom of constructibility
devised by Gödel). One important application of ◇ is the following result, which we
state without proof.


Theorem 884. ◇ ⇒ not-SH.


It follows that the negation of the Suslin Hypothesis (not-SH) is also consistent 
with the standard axioms of set theory. Combined with the consistency of SH, this 
means that SH cannot be settled using the usual axioms. In other words, neither SH 
nor its negation can be derived from the standard axioms: The Suslin Hypothesis is 
independent of the standard axioms of set theory, assuming that these axioms are 
themselves consistent. 

MA and ◇ have many interesting properties and applications, but we conclude
our brief discussion of infinitary combinatorics and refer the reader to some texts
for further study.

Good introductions to infinitary combinatorics can be found in [14, 35, 41, 48]. 
For more advanced treatments see [34, 37, 44]. 


Part III
Real Point Sets 


Introduction to Part III 


This part focuses exclusively on the real line R. Cantor’s work not only gave birth
to the theory of the transfinite, but was also instrumental in the development of point set
topology, which, roughly speaking, is the study of limits and continuity in a general 
setting. Topological notions such as closed sets, dense-in-itself sets, and perfect sets 
were first introduced by Cantor. 

The opening chapter, much of which is very elementary, introduces base 
representation via interval trees, Cantor systems, and generalized Cantor sets. The 
next chapter deals with basic topology of the real line. 

The material of the chapter on Heine–Borel and Baire-Category Theorems is
often called “measure and category.” It is shown that G_δ sets satisfy the Continuum
Hypothesis, and that perfect sets have cardinality c.

The chapters on Cantor–Bendixson analysis and on Brouwer’s and Sierpinski’s
Theorems are somewhat more special. An application of the ordinals is illustrated
by the method of Cantor–Bendixson analysis, giving a complete enumeration of
the ℵ₁ distinct “homeomorphism types” of countable compact sets. The proofs
of Brouwer’s and Sierpinski’s Theorems given here illustrate how the
Cantor–Dedekind theory of order can be used to give somewhat elementary proofs of some
relatively advanced topological results.

The chapter on Borel and analytic sets touches on the rudiments of descriptive
set theory, and proves that the analytic sets have the perfect set property, the best
possible result that can be proved using the usual axioms of set theory. They are
also shown to be Lebesgue measurable (and having the Baire property) using the 
Ulam matrix decomposition for coanalytic sets. To obtain a non-Borel analytic set, 
a direct effective proof of the boundedness theorem for the set of codes of well- 
founded trees is given (since with no access to product spaces, the standard method 
of diagonalizing universal sets cannot be used). 

The postscript chapter for this part gives a detailed account of Ulam’s analysis 
of the measure problem leading to the notion of measurable cardinals, and a brief 
discussion of Lusin’s problem for the projective sets. 
Chapter 13
Interval Trees and Generalized Cantor Sets 


Abstract This elementary chapter applies the nested intervals theorem to obtain 
base expansion of real numbers via trees of uniformly subdivided nested closed 
intervals, with detailed illustrations for ternary expansions. The construction of the 
Cantor set is then generalized to Cantor systems (systems of nested intervals indexed 
by binary trees), to formally introduce generalized Cantor sets. 


13.1 Intervals, Sup, and Inf 


Definition 885. An interval is a set having one of the forms
(a, b), [a, b], [a, b), (a, b], (a, ∞), [a, ∞), (−∞, b), (−∞, b], R, ∅.
An open interval is an interval having one of the forms
(a, b), (a, ∞), (−∞, b), R, ∅.
A closed interval is an interval having one of the forms
[a, b], [a, ∞), (−∞, b], R, ∅.


An interval is proper if it contains at least two points. 


Note that each of ∅ and R is both an open interval and a closed interval. Moreover
every singleton set {a} = [a, a] is a closed interval (improper).


Definition 886 (Bounds, Sup, and Inf). Let A ⊆ R.


1. u ∈ R is an upper bound of A, written as u ≥ A or A ≤ u, if u ≥ x for all x ∈ A.
2. u ∈ R is a strict upper bound of A, written as u > A or A < u, if u > x for all
x ∈ A.

3. A is bounded if there exist a, b ∈ R such that a ≤ A ≤ b.

4. p ∈ R is a least upper bound or supremum of A if p is an upper bound of A and
no point q < p is an upper bound of A. If a least upper bound of A exists, it will
be unique and will be denoted by sup A. We will also use the notation


sup_{x∈A} f(x)


to denote sup f[A].
5. We write u = max A if A ≤ u and u ∈ A, that is when u is the greatest element
of A.


Lower bounds, greatest lower bound (infimum), inf A, min A, etc., are similarly defined.


Recall (from our definitions of these notions for orders) that a = sup A if and only
if either a = max A or a > A and a is an upper limit point of A. Moreover, since
R is a complete order, for every nonempty set A if A is bounded above then sup A
exists, and if A is bounded below then inf A exists.


The Nested Intervals Theorem in R 


Recall that the completeness of R implies the sequential nested interval property. 
The following variant (and immediate consequence) of that result will be used 
frequently in this part of the book. 


Theorem 887 (Nested Intervals Theorem in R). Suppose that


I₁ ⊇ I₂ ⊇ ··· ⊇ I_n ⊇ I_{n+1} ⊇ ···


is a sequence of nested intervals satisfying the following conditions.


1. Each interval is nonempty and closed, where we allow the closed intervals to be
unbounded, having the form [a, ∞), or (−∞, a], or (−∞, ∞) = R.

2. lim_{n→∞} len(I_n) = 0, that is, for any ε > 0 there is k such that len(I_n) < ε for
n > k.


Then the intersection of the intervals is a singleton,


⋂_n I_n = {a} for some a ∈ R.


Note that while we are allowing unbounded closed intervals in the nested sequence, 
the second condition implies that at most finitely many of those intervals can be 
unbounded. 

Problem 888. Show that the second condition in the theorem above cannot be
dropped. 


Problem 889. Show that we cannot replace “closed” by “open” in the above 
theorem. 


Problem 890. Show that the above theorem remains valid if we replace “closed” by “open” and assume that $\inf I_n < \inf I_{n+1}$ and $\sup I_{n+1} < \sup I_n$.


13.2 Interval Subdivision Trees 


In this section we explain how the familiar method of decimal expansions of 
numbers in the interval [0, 1] naturally leads to iterated subdivisions of intervals 
forming a tree structure. Instead of the base 10 (decimal) system, one can use any 
fixed base. 


Using a Fixed Base b 


Given a positive integer $b \ge 2$ (the base) and a closed interval $I$, we subdivide $I$ into $b$ equal closed subintervals each of length $\frac{1}{b}\operatorname{len}(I)$ and write these subintervals as:

$$I[0],\ I[1],\ \dots,\ I[b-1] \qquad (\text{base } b).$$

Thus $I[d]$ is the $d$-th subinterval in this subdivision of $I$ into $b$ equal subintervals, for $d = 0, 1, \dots, b-1$.¹

The process is then further iterated as follows. If $d_1$ and $d_2$ are two $b$-ary digits (i.e., $d_1, d_2 \in \{0, 1, \dots, b-1\}$), then we let $I[d_1d_2]$ denote the $d_2$-th subinterval of $I[d_1]$. Thus in the first stage $I$ is subdivided into $b$ equal subintervals $I[0], I[1], \dots, I[b-1]$, and then each of these $b$ subintervals $I[d_1]$ is further subdivided into $b$ smaller sub-subintervals $I[d_10], I[d_11], \dots, I[d_1(b-1)]$, giving a total of $b^2$ sub-subintervals at the second stage.

We can continue iterating the process, giving $b^n$ intervals at stage $n$.
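
The iterated subdivision can be sketched in a few lines of Python. This is only an illustration under our own conventions: the function name `subinterval` and the representation of a closed interval as a pair of exact fractions are assumptions, not notation from the text.

```python
from fractions import Fraction

def subinterval(interval, digits, b):
    """Return I[d1 d2 ... dn] as a pair (lo, hi) of exact fractions.

    Each digit d selects the d-th of b equal closed parts of the
    interval reached so far (hypothetical helper, not from the text).
    """
    lo, hi = interval
    for d in digits:
        if not 0 <= d < b:
            raise ValueError("digit out of range for the base")
        step = (hi - lo) / b          # length of each of the b parts
        lo, hi = lo + d * step, lo + (d + 1) * step
    return lo, hi

unit = (Fraction(0), Fraction(1))
print(subinterval(unit, [2], 3))      # I[2] in base 3
print(subinterval(unit, [0, 1], 3))   # I[01] in base 3
```

With `n` digits the returned interval has length `len(I) / b**n`, matching the count of $b^n$ intervals at stage $n$.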


¹ Our notation is ambiguous since $I[1]$ could denote the second subinterval in two, or three, or ten (or any other number of) equal subdivisions of $I$. It would be more correct, but more clumsy, to write $I_b[0], I_b[1], \dots, I_b[b-1]$ in place of $I[0], I[1], \dots, I[b-1]$. Since the base $b$ is generally fixed throughout a situation, it is understood from context, and dropping the subscript $b$ does not cause any confusion.

The Ternary Subdivision Tree


We illustrate the system and notation for the specific case where $b = 3$ (the ternary system) and $I = [0, 1]$, the unit interval. The general case of any $b \ge 2$ is so similar that we will not discuss it separately.


The initial three ternary subdivisions of $I = [0, 1]$ are:

$$I[0] = [0, \tfrac13], \quad I[1] = [\tfrac13, \tfrac23], \quad \text{and} \quad I[2] = [\tfrac23, 1] \qquad (\text{base } b = 3).$$

These three intervals each have length $\frac13$. Then each of these is further subdivided into three equal sub-subintervals, giving a total of nine sub-subintervals, each of length $\frac19$:

$$I[00],\ I[01],\ I[02];\quad I[10],\ I[11],\ I[12];\quad I[20],\ I[21],\ I[22],$$

where $I[00] = [0, \frac19]$, $I[01] = [\frac19, \frac29]$, etc., with $I[22] = [\frac89, 1]$.


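[Figure: the unit interval $I = [0, 1]$ with division points $\frac19, \frac29, \dots, \frac89$, showing the three subintervals $I[0], I[1], I[2]$ and, within them, the nine sub-subintervals $I[00], I[01], \dots, I[22]$.]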

This process is further continued to obtain $9 \times 3 = 27$ sub-sub-subintervals each of length $1/27$, denoted by $I[000], I[001], I[002], I[010], \dots, I[222]$.


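Regarding subintervals as descendants of intervals containing them, the entire system can be arranged in the form of a tree:

[Figure: the tree of ternary subdivisions. Root: $I = [0, 1]$. Level 1: $I[0] = [0, \frac13]$, $I[1] = [\frac13, \frac23]$, $I[2] = [\frac23, 1]$. Level 2: the nine intervals $I[00] = [0, \frac19], \dots, I[22] = [\frac89, 1]$, with each $I[d_1d_2]$ placed below $I[d_1]$.]

In general, at stage $n$, there will be $3^n$ subintervals of the form $I[d_1d_2\cdots d_n]$.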

Ternary Strings


Notice how at stage $n$, the $3^n$ intervals of the form $I[d_1d_2\cdots d_n]$ are indexed by ternary strings $d_1d_2\cdots d_n$ formed out of the three “letters” from the set $\{0, 1, 2\}$, which is called the ternary alphabet. The finite ternary strings themselves are of various lengths:

$$\varepsilon,\ 0,\ 1,\ 2,\ 00,\ 01,\ 02,\ 10,\ 11,\ 12,\ 20,\ 21,\ 22,\ 000,\ 001,\ 002,\ 100,\ \dots$$

Here $\varepsilon$ is the empty string (of length zero), and there are $3^n$ ternary strings of length $n$.

The finite ternary strings themselves are naturally arranged in an infinite tree by 
regarding string prefixes (i.e., initial segments) as ancestors: 


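[Figure: the tree of finite ternary strings. Root: $\varepsilon$. Level 1: 0, 1, 2. Level 2: 00, 01, 02 (below 0); 10, 11, 12 (below 1); 20, 21, 22 (below 2). Every node has three children, and the tree continues downward indefinitely.]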


Thus the system of ternary subdivision of intervals, when arranged by the relation 
of containment of intervals, is naturally mapped by the tree of ternary words 
arranged by the relation of prefixes of words. Clearly, this mapping is a one-to- 
one correspondence between the nodes which transforms string prefix relations into 
interval containment relations. We thus have a natural representation of the ternary 
interval tree by the tree of finite ternary strings. 
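
As a small sanity check of this correspondence, the following Python sketch verifies, for all ternary strings of length at most 3, that $u$ is a prefix of $v$ exactly when $I[v] \subseteq I[u]$. The helper names `ternary_interval`, `strings_up_to`, and `prefix_matches_containment` are our own and not from the text; strings are represented as Python `str` values over the characters `0`, `1`, `2`.

```python
from fractions import Fraction
from itertools import product

def ternary_interval(s):
    """Closed interval I[s] inside [0, 1] for a finite ternary string s."""
    lo, length = Fraction(0), Fraction(1)
    for ch in s:
        length /= 3                 # each stage divides the length by 3
        lo += int(ch) * length      # move to the chosen third
    return lo, lo + length

def strings_up_to(n):
    """All ternary strings of length 0..n, shortest first."""
    return ["".join(p) for k in range(n + 1) for p in product("012", repeat=k)]

def prefix_matches_containment(max_len):
    """Check: u is a prefix of v  <=>  I[v] is contained in I[u]."""
    nodes = strings_up_to(max_len)
    for u in nodes:
        a, b = ternary_interval(u)
        for v in nodes:
            c, d = ternary_interval(v)
            if (a <= c and d <= b) != v.startswith(u):
                return False
    return True

print(prefix_matches_containment(3))
```

Exact rational arithmetic is used so that shared endpoints such as $\frac13$ are compared without rounding error.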


13.3 Infinite Branches Through Trees 


An infinite branch through the above ternary tree is an infinite set of nodes (i.e., strings)

$$\{\varepsilon,\ d_1,\ d_1d_2,\ d_1d_2d_3,\ \dots,\ d_1d_2\cdots d_n,\ \dots\}$$

containing exactly one node of each length $n = 0, 1, 2, \dots$, in which the shortest node $\varepsilon$ of length 0 is the root (“the branch starts at the root”), and each node $d_1d_2\cdots d_n$ of length $n$ is obtained from the previous node $d_1d_2\cdots d_{n-1}$ of length $n - 1$ by appending a single digit $d_n$. Note that we usually draw trees “growing downward,” so the branch “grows downward” in the tree as well.


Infinite Branches as Infinite Digit Strings 


The infinite branch above, namely $\varepsilon, d_1, d_1d_2, d_1d_2d_3, \dots$, can be identified with the infinite ternary sequence $d_1d_2\cdots d_n\cdots \in \{0, 1, 2\}^{\mathbb{N}}$, since by taking finite initial prefixes of an infinite string, each branch through the tree is represented uniquely by a single infinite ternary string.

For example, the initial prefixes of the constant infinite ternary string $000000\cdots$ are $\varepsilon$, 0, 00, 000, etc., so the infinite ternary string $000000\cdots$ represents the leftmost branch of the ternary tree, while the infinite string $111111\cdots$ represents the center-most branch going straight down right through the middle of the tree. As another example, the bold segments in the following figure illustrate a “zigzag” infinite branch represented by the infinite string $0202020202\cdots$.


[Figure: the ternary tree with the “zigzag” branch $\varepsilon$, 0, 02, 020, 0202, 02020, 020202, ... drawn in bold.]

So by taking the finite initial prefixes of any given infinite ternary string, we get a 
set of nodes growing down through the ternary tree forming an infinite branch. And 
conversely, any infinite branch through the ternary tree produces an infinite ternary 
string as its “limit.” 
Thus, we have a natural one-to-one correspondence between infinite ternary 
strings and the infinite branches through the ternary tree. 

Infinite Branches as Nested Intervals


Say that a sequence of bounded closed intervals is a nested ternary sequence if the 
first interval in the sequence is the unit interval, and each succeeding interval equals 
either the left-third, the middle-third, or the right-third closed subinterval of the 
preceding interval in the sequence. Note that an infinite branch through the ternary 
tree determines a nested ternary sequence of intervals. 

Specifically, given an infinite ternary string $d_1d_2d_3\cdots$ (with $d_n \in \{0, 1, 2\}$ for all $n$), the intervals indexed by initial prefixes of $d_1d_2d_3\cdots$ form the nested ternary sequence of intervals

$$[0, 1] = I[\varepsilon] \supseteq I[d_1] \supseteq I[d_1d_2] \supseteq \cdots \supseteq I[d_1d_2\cdots d_n] \supseteq \cdots$$


We thus also have a natural one-to-one correspondence of infinite ternary strings 
with nested ternary sequences of intervals, via infinite branches through the ternary 
tree. This is the basis of ternary expansions. 


Ternary Expansions 


In the nested ternary sequence of intervals displayed above, we have $\operatorname{len}(I[\varepsilon]) = 1$, $\operatorname{len}(I[d_1]) = \frac13$, $\operatorname{len}(I[d_1d_2]) = \frac19$, and so on, so that $\operatorname{len}(I[d_1d_2\cdots d_n]) = \frac{1}{3^n} \to 0$ as $n \to \infty$. Hence by the Nested Interval Property the above sequence of nested intervals must contain a unique real number in their intersection, i.e.,

$$\bigcap_{n=1}^{\infty} I[d_1d_2\cdots d_n] = \{x\}, \quad \text{for a unique } x \in \mathbb{R}.$$

In this case, we say that the infinite ternary string $d_1d_2d_3\cdots$ represents the real number $x$, or that $x$ has a ternary expansion $d_1d_2d_3\cdots$. This is also expressed as:

$$x = 0.d_1d_2d_3\cdots d_n\cdots \quad \text{in ternary expansion.}$$


Thus every infinite ternary string determines a unique real number in [0, 1] via the 
nested intervals associated with the initial prefixes of the infinite string. Conversely, 
we have: 


Theorem 891. Any real number in $[0, 1]$ has a ternary expansion.
The proofs of the theorems above and below are left as exercises. 


Theorem 892 (Ternary Expansion in [0, 1]). For each finite ternary string $u \in \{0, 1, 2\}^*$ let $I[u]$ denote the corresponding iterated subinterval in the system of ternary subdivision over $[0, 1]$. Let $(d_n)_{n=1}^{\infty} = d_1d_2\cdots d_n\cdots$ be an infinite ternary string, and put $[a_n, b_n] := I[d_1d_2\cdots d_n]$.

Then for any $x \in [0, 1]$, the conditions below are equivalent to each other:

1. $x = 0.d_1d_2\cdots d_n\cdots$ in ternary expansion.
2. $\bigcap_{n=1}^{\infty} I[d_1d_2\cdots d_n] = \{x\}$, i.e., $x$ is the unique member of the intersection of the nested intervals $I[\varepsilon] \supseteq I[d_1] \supseteq I[d_1d_2] \supseteq \cdots \supseteq I[d_1d_2\cdots d_n] \supseteq \cdots$.
3. $x = \sum_{n=1}^{\infty} \frac{d_n}{3^n}$ (infinite series expansion).
4. $x = \lim_{n \to \infty} a_n$ (limit of the left endpoints of the nested intervals).
5. $x = \lim_{n \to \infty} b_n$ (limit of the right endpoints of the nested intervals).

A similar result is true for any fixed base b.
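
The equivalences can be explored numerically. The sketch below is an illustration only: the helper names `ternary_digits` and `endpoints` are our own, the greedy digit rule is one possible way to produce an expansion, and the input is assumed to be a rational in $[0, 1]$ given as a `Fraction`. It computes the first $n$ digits of a ternary expansion of $x$ and the endpoints $a_n$, $b_n$ of $I[d_1\cdots d_n]$, where $a_n$ is the $n$-th partial sum of the series in condition 3.

```python
from fractions import Fraction

def ternary_digits(x, n):
    """First n digits of one ternary expansion of x in [0, 1] (greedy rule)."""
    digits = []
    for _ in range(n):
        x *= 3
        d = min(int(x), 2)      # the cap at 2 handles x = 1 = 0.222...
        digits.append(d)
        x -= d
    return digits

def endpoints(digits):
    """Endpoints (a_n, b_n) of I[d1...dn]; a_n is the partial sum of d_k / 3^k."""
    a = sum(Fraction(d, 3 ** k) for k, d in enumerate(digits, start=1))
    return a, a + Fraction(1, 3 ** len(digits))

x = Fraction(7, 10)
ds = ternary_digits(x, 8)
a, b = endpoints(ds)
print(ds)
print(a <= x <= b, b - a)       # x lies in I[d1...d8], whose length is 1/3^8
```

Since $a_n \le x \le b_n$ and $b_n - a_n = 3^{-n}$, both endpoint sequences converge to $x$, which is the content of conditions 4 and 5.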

Problem 893. Find the values of the following infinite series expansions:

1. $0.111111\cdots$ (ternary)
2. $0.111111\cdots$ (binary)
3. $0.111111\cdots$ (decimal)
4. $0.1022222222\cdots$ (ternary)
5. $0.1100000000\cdots$ (ternary)
6. $0.02020202\cdots$ (ternary)

[Hint: Use the fact that the sum of a convergent geometric series $a + ar + ar^2 + \cdots$ is $a/(1 - r)$, where $|r| < 1$.]


Problem 894. Let $b \in \{2, 3, 4, \dots\}$ be a fixed base. A rational number $x$ is said to be $b$-adic if $x = m/b^n$ for some $m \in \mathbb{Z}$ and $n \in \mathbb{N}$. We use the terms dyadic for 2-adic and triadic for 3-adic.

1. Prove that a real number in $[0, 1]$ has a ternary expansion which is eventually constant to a digit value of 0 or 2 if and only if it is a triadic rational in $[0, 1]$.

2. Formulate and prove similar results for the binary (base $b = 2$) and the decimal (base $b = 10$) systems.

3. Prove that $0 < x < 1$ has multiple ternary expansions if and only if $x$ is a triadic rational. More specifically, show that every triadic rational $x$, $0 < x < 1$, has exactly two ternary expansions, and every other real in $[0, 1]$ has a unique ternary expansion.


Problem 895. An infinite digit sequence $d_1d_2\cdots d_n\cdots$ is said to be repeating if there is a finite block of digits which eventually keeps repeating (formally: if there are $r \in \mathbb{N}$ and $k \in \mathbb{N}$ such that $d_{n+r} = d_n$ for all $n > k$).

1. Prove that $0 < x < 1$ has a repeating ternary expansion if and only if $x$ is a rational number in $[0, 1]$.
2. Prove that the same result is true for any fixed base $b \in \{2, 3, 4, \dots\}$.


Problem 896. Show that the Cantor set consists of those reals in the unit interval which admit a ternary expansion in which the digit 1 does not occur.

Problem 897. Show that i is a member of the Cantor set, but 8e/71 is not. Give
an example of an irrational member of the Cantor set. 


Problem 898. Show that every real number in [0,2] can be expressed as a sum of 
two members of the Cantor set. 


[Hint: First show that every real in [0, 1] can be expressed as the sum of two reals in 
[0, 1] each of which has a ternary expansion not containing the digit 2.] 


13.4 Cantor Systems and Generalized Cantor Sets 


The following definition is a direct generalization of the binary tree of intervals that 
was used in the construction of the Cantor set. 


Definition 899 (Cantor Systems). A family $J = (J_u \mid u \in \{0, 1\}^*)$ of sets indexed by the binary tree $\{0, 1\}^*$ is called a Cantor System if for each binary string $u \in \{0, 1\}^*$:

1. $J_u$ is a bounded proper closed interval, i.e., $J_u = [a, b]$ for some $a < b$;
2. $J_{u^\frown 0},\ J_{u^\frown 1} \subseteq J_u$;
3. $J_{u^\frown 0} \cap J_{u^\frown 1} = \emptyset$;
4. For any infinite binary sequence $b = b_1b_2\cdots b_n\cdots \in \{0, 1\}^{\mathbb{N}}$,

$$\lim_{n \to \infty} \operatorname{len}(J_{b|n}) = \lim_{n \to \infty} \operatorname{len}(J_{b_1b_2\cdots b_n}) = 0.$$

Note that the notation $b|n$ denotes the finite initial prefix of the infinite sequence $b$ consisting of its first $n$ entries, i.e., $b|n := b_1b_2\cdots b_n$. Also, recall that the notation $u^\frown d$ denotes the string which is obtained by appending the digit $d$ to the string $u$, so that $\operatorname{len}(u^\frown d) = \operatorname{len}(u) + 1$.


Let $(J_u \mid u \in \{0, 1\}^*)$ be a Cantor system. If $b = b_1b_2\cdots b_n\cdots \in \{0, 1\}^{\mathbb{N}}$ is an infinite binary sequence, then $J_{b_1} \supseteq J_{b_1b_2} \supseteq \cdots \supseteq J_{b_1b_2\cdots b_n} \supseteq \cdots$ forms a nested sequence of nonempty closed intervals whose lengths approach zero, and so their intersection must be a singleton. Hence each infinite binary sequence $b = b_1b_2\cdots b_n\cdots \in \{0, 1\}^{\mathbb{N}}$ determines a unique real number $x_b$ such that

$$\{x_b\} = \bigcap_n J_{b|n}.$$

Moreover, note that distinct infinite binary sequences determine distinct real numbers: If $b = b_1b_2\cdots b_n\cdots$ and $c = c_1c_2\cdots c_n\cdots$ are distinct infinite binary sequences, then there exists a least $k$ such that $b_k \ne c_k$. Then $J_{b_1b_2\cdots b_k}$ and $J_{c_1c_2\cdots c_k}$ are disjoint subintervals of $J_{b_1b_2\cdots b_{k-1}} = J_{c_1c_2\cdots c_{k-1}}$, so $x_b \ne x_c$. By setting $\varphi(b) := x_b$, we get an injective mapping $\varphi\colon \{0, 1\}^{\mathbb{N}} \to \mathbb{R}$. Thus every Cantor system $J = (J_u \mid u \in \{0, 1\}^*)$ effectively determines a unique one-to-one mapping $\varphi$ from $\{0, 1\}^{\mathbb{N}}$ into the reals such that

$$\text{for all } b \in \{0, 1\}^{\mathbb{N}}\colon \quad \{\varphi(b)\} = \bigcap_n J_{b|n}.$$

The set of all real numbers $x_b = \varphi(b)$ as $b$ ranges over all possible infinite binary sequences, that is, the range of the function $\varphi$, is called the set generated by the Cantor system $(J_u \mid u \in \{0, 1\}^*)$.


Definition 900 (Set Generated by a Cantor System). The set generated by the Cantor system $J = (J_u \mid u \in \{0, 1\}^*)$ is the set $P$ of real numbers defined by the condition:

$$x \in P \quad \text{if and only if} \quad \text{there exists } b \in \{0, 1\}^{\mathbb{N}} \text{ such that } x \in \bigcap_n J_{b|n}.$$


Definition 901 (Generalized Cantor Sets). A set is called a generalized Cantor 
set or a Cantor-like set if it is generated by some Cantor system. 


We summarize the above discussion in: 

Proposition 902. If $P$ is the generalized Cantor set generated by a Cantor system $J = (J_u \mid u \in \{0, 1\}^*)$, then the function $\varphi$ above maps $\{0, 1\}^{\mathbb{N}}$ bijectively onto $P$. Hence every generalized Cantor set is effectively bijective with $\{0, 1\}^{\mathbb{N}}$ and so has cardinality $\mathfrak{c} = 2^{\aleph_0}$.


The bijection in the above proposition can be viewed as a correspondence between 
the infinite branches of the binary tree and the points of the generalized Cantor set P 
being generated: Each infinite branch through the binary tree determines a sequence 
of nested intervals, which in turn determines a point of the set P. 
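
For a concrete instance, the middle-thirds construction gives a Cantor system in which appending 0 keeps the left third and appending 1 keeps the right third. The Python sketch below is illustrative only: the name `cantor_interval` and the encoding of binary strings as tuples of 0s and 1s are our own assumptions. It computes $J_u$ exactly and lists the intervals at one level.

```python
from fractions import Fraction
from itertools import product

def cantor_interval(u):
    """J_u for the middle-thirds system: bit 0 = left third, bit 1 = right third."""
    lo, hi = Fraction(0), Fraction(1)
    for bit in u:
        third = (hi - lo) / 3
        if bit == 0:
            hi = lo + third     # keep the left third
        else:
            lo = hi - third     # keep the right third
    return lo, hi

# The 2^4 intervals J_u with len(u) = 4, listed from left to right.
level4 = [cantor_interval(u) for u in product((0, 1), repeat=4)]
for lo, hi in level4[:3]:
    print(lo, hi)

# Left endpoints of J_{b|n} approximate the point determined by b = 0101...
b = (0, 1) * 6
print([cantor_interval(b[:n])[0] for n in (2, 4, 6)])
```

Each $J_{b|n}$ has length $3^{-n}$, so condition 4 of the definition holds, and the left endpoints increase to the point $\varphi(b)$.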


Problem 903. Let $(J_u \mid u \in \{0, 1\}^*)$ be a Cantor system. If $J_u \cap J_v \ne \emptyset$, then show that one of the binary strings $u$ and $v$ must be an extension of the other (i.e., either $u$ is an initial prefix of $v$ or $v$ is an initial prefix of $u$).


Problem 904. Let $(J_u \mid u \in \{0, 1\}^*)$ be a Cantor system which generates the generalized Cantor set $P$. Show that

1. For any $x \in P$ and $\delta > 0$ there is $u \in \{0, 1\}^*$ such that $x \in J_u$ and $\operatorname{len}(J_u) < \delta$.
2. Every $J_u$ contains some point of $P$.


Problem 905. Let $P$ be the generalized Cantor set generated by a Cantor system $(J_u \mid u \in \{0, 1\}^*)$, and for each $n = 0, 1, 2, \dots$ let $F_n$ be the union of the $2^n$ disjoint intervals $J_u$ where $u$ is a binary string of length $n$, that is,

$$F_n := \bigcup \{J_u \mid \operatorname{len}(u) = n\}.$$

Show that

$$P = \bigcap_n F_n.$$


Chapter 14 
Real Sets and Functions 


Abstract This chapter covers the basic topology of the real line. Many of the notions of this chapter, such as derived sets, closed sets, dense-in-itself sets, and perfect sets, were first introduced by Cantor during his study of the real continuum.


14.1 Open Sets 


Definition 906 (Open Sets). A set $G \subseteq \mathbb{R}$ is called open if every point of $G$ belongs to some open interval contained in $G$, that is, if for every $p \in G$, there is an open interval $I$ such that $p \in I \subseteq G$.


Note that every nonempty open set contains a nonempty open interval and hence 
must be uncountable. Thus no nonempty countable set can be open. Also, no 
nonempty bounded closed interval is open. 


Problem 907. Show that in the definition of open sets we can replace “open 
interval” with “bounded open interval.” 


Problem 908. Show that a set is open if and only if it can be expressed as the union 
of some family of open intervals. 


Problem 909. 1. The empty set $\emptyset$ and $\mathbb{R}$ are open sets.
2. The union of any collection of open sets is open.
3. The intersection of finitely many open sets is open.


Problem 910. Show that the intersection of infinitely many open sets may not be 
open. 


Problem 911 (Countable Base). Let $\mathcal{B} := \{(p, q) \mid p, q \in \mathbb{Q},\ p < q\}$ be the collection of all nonempty bounded open intervals with rational endpoints. Then $\mathcal{B}$ is a countable collection of open intervals, and every open set is a (countable) union of members of $\mathcal{B}$.

Problem 912. Show that there are exactly $2^{\aleph_0} = \mathfrak{c}$ many open sets.

Problem 913 (The Countable Chain Condition). Show that every family of pairwise disjoint nonempty open sets is countable, and hence every family of pairwise disjoint nonempty open intervals is countable.

[Hint: Every nonempty open interval contains a rational number.]


Problem 914. Let $A \subseteq \mathbb{R}$ and suppose that $\forall x, y \in A,\ x < z < y \Rightarrow z \in A$. Show that $A$ must be an interval. If in addition $A$ is open show that $A$ must be an open interval.

[Hint: If $A$ is nonempty bounded, put $a = \inf A$, $b = \sup A$, and show that $A$ must be one of $(a, b)$, $[a, b]$, $(a, b]$ or $[a, b)$. If $A$ is neither bounded above nor bounded below, $A$ must be $\mathbb{R}$. If $A$ is bounded below but not above, $A$ must be one of $(a, \infty)$ or $[a, \infty)$ where $a = \inf A$. Etc.]


Problem 915. Let $\mathcal{C}$ be a collection of open intervals having nonempty intersection (so there is $p$ such that $p \in I$ for every $I \in \mathcal{C}$). Show that the union $\bigcup \mathcal{C}$ of all the intervals in $\mathcal{C}$ is itself an open interval.

In particular, the union of two open intervals having nonempty intersection is an 
open interval. 


Problem 916. Let $\mathcal{C}$ be a family of pairwise disjoint nonempty open intervals, and let $G = \bigcup \mathcal{C}$. Show that if $I$ is a nonempty open interval contained in $G$, then $I$ is contained in a unique member of $\mathcal{C}$.

Problem 917. Let $G$ be a nonempty open set and for $x, y \in G$ write $x \sim y$ if there is an open interval contained in $G$ which contains both $x$ and $y$. Show that $\sim$ is an equivalence relation on $G$ which partitions $G$ into nonempty open intervals.


From the last few problems we get a canonical decomposition of each open set into 
a unique countable family of disjoint open intervals. 


Theorem 918 (Canonical Decomposition of Open Sets into Disjoint Open Inter- 
vals). Every open set can be expressed as the union of a unique (countable) family 
of pairwise-disjoint nonempty open intervals. 
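
For an open set given as a finite union of bounded open intervals, the canonical decomposition can be computed by sorting and merging. The sketch below covers only this finite special case, and the function name `components` and the pair representation `(a, b)` are our own assumptions. Note that $(0, 1)$ and $(1, 2)$ are not merged, since the point 1 is not in their union.

```python
def components(intervals):
    """Pairwise disjoint open intervals with the same union as the input.

    `intervals` is a finite iterable of pairs (a, b), each standing for
    the open interval (a, b); empty intervals (a >= b) are ignored.
    """
    result = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        if result and a < result[-1][1]:
            # (a, b) overlaps the last component: extend it if needed.
            result[-1] = (result[-1][0], max(result[-1][1], b))
        else:
            # No overlap (touching endpoints do not count): new component.
            result.append((a, b))
    return result

print(components([(5, 7), (0, 2), (3, 4), (1, 3), (6, 8)]))
```

The strict comparison `a < result[-1][1]` reflects the fact that two open intervals sharing only an endpoint have a union that is not an interval.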


14.2 Limit Points, Isolated Points, and Derived Sets 


For general orders, we had defined the notions of upper and lower limit points, 
derived sets, bounds, supremum, etc, and the same definitions apply for R: 


Definition 919 (Limit Points and Derived Sets). Let $A \subseteq \mathbb{R}$.

1. A point $p \in \mathbb{R}$ is an upper limit point of $A$ if for all $x < p$ there is $y \in A$ such that $x < y < p$. Lower limit points are defined similarly.

2. A point is a limit point of $A$ if it is either a lower or an upper limit point of $A$. The set of all limit points of $A$ is called the derivative or derived set of $A$ and will be denoted by $D(A)$.

3. A point $p \in A$ is an isolated point of $A$ if $p$ is not a limit point of $A$, that is, if $p \in A \setminus D(A)$.

4. A limit point of $A$ which is both an upper and a lower limit point of $A$ will be called a two-sided limit point of $A$; otherwise it is a one-sided limit point of $A$.


Problem 920. For each of the following sets $A$, find $D(A)$.

1. $\mathbb{Z}$.
2. $\{1/n \mid n \in \mathbb{N}\}$.
3. $\mathbb{Q} \cap (0, 1)$.
4. $\{1/2^m + 1/2^{m+n} \mid m, n \in \mathbb{N}\}$.

Problem 921. Give an example of an infinite set $A$ such that $A$ has arbitrarily close points (for any $p > 0$ there are $x, y \in A$ with $0 < |x - y| < p$) but $A$ has no limit points ($D(A) = \emptyset$).

Problem 922. Let $E$ be the set of all points $x \in [0, 1]$ having a ternary expansion $x = \sum_{n=1}^{\infty} x_n/3^n$ for which there is $k$ such that $x_n = 0$ or $2$ for $n < k$ and $x_n = 1$ for all $n \ge k$ (i.e., any point $x \in E$ has a ternary expansion of the form $x = 0.x_1x_2\cdots x_{k-1}1111\cdots$ with $x_1, x_2, \dots, x_{k-1} \in \{0, 2\}$).

1. Which points of $E$ are limit points of $E$?
2. Find $D(E)$.


Problem 923. Show that p € R is a limit point of A if and only if every open 
interval containing p contains a point of A other than p if and only if every open 
interval containing p contains infinitely many points of A. 

Show that $p$ is an isolated point of $A$ if and only if $I \cap A = \{p\}$ for some open interval $I$.

Show that $p$ is not a limit point of $A$ if and only if $I \cap A \subseteq \{p\}$ for some open interval $I$ containing $p$.

Problem 924 (Properties of D(A)). For any sets $A$ and $B$ we have:

1. $A \subseteq B \Rightarrow D(A) \subseteq D(B)$.
2. $D(A \cup B) = D(A) \cup D(B)$.
3. $D(D(A)) \subseteq D(A)$.

Problem 925. Give an example of a set $A$ such that

$$\emptyset \subsetneq D(D(A)) \subsetneq D(A) \subsetneq A$$

(all inclusions being proper).

14.3 Closed, Dense-in-Itself, and Perfect Sets


Definition 926 (Cantor). A set $A \subseteq \mathbb{R}$ is called

1. Closed if every limit point of $A$ is in $A$, i.e., if $D(A) \subseteq A$.
2. Dense-in-itself if every point of $A$ is a limit point of $A$, i.e., if $A \subseteq D(A)$.
3. Perfect if it is both closed and dense-in-itself, i.e., if $D(A) = A$.


Some examples: 


• Any finite set is closed. The set $D(A)$ of limit points of any set $A$ is closed (recall that $D(D(A)) \subseteq D(A)$ in orders, and so in $\mathbb{R}$ too).

• The set $\mathbb{Z}$ of integers is closed but not dense-in-itself, while the set $\mathbb{Q}$ of rational numbers is dense-in-itself but not closed.

• Every proper closed interval is perfect. The Cantor set is perfect.

• The set $A := \{1, \frac12, \frac13, \dots, \frac1n, \dots\}$ is not closed since $A$ has a (unique) limit point 0 which is not in $A$. But adjoining this limit point 0 to the set $A$ gives a closed set $A \cup \{0\}$. This method is fully general, and leads to the notion of closure.


Definition 927 (Closure). The closure $\overline{A}$ of $A$ is the set $\overline{A} := A \cup D(A)$.

Theorem 928. For any set $A$, its closure $\overline{A} = A \cup D(A)$ is closed. In fact, $\overline{A}$ is the smallest closed set containing $A$.

Proof. We have:

$$D(\overline{A}) = D(A \cup D(A)) = D(A) \cup D(D(A)) \subseteq D(A) \cup D(A) = D(A) \subseteq \overline{A},$$

and so $\overline{A}$ is closed.
Now if $B$ is any closed set containing $A$, then since $D(A) \subseteq D(B) \subseteq B$,

$$\overline{A} = A \cup D(A) \subseteq A \cup D(B) \subseteq A \cup B = B.$$

Thus the closed set $\overline{A}$ is contained in every closed set containing $A$, and hence $\overline{A}$ is the smallest closed set containing $A$. □

It follows immediately from the definitions of closure and of closed sets that:
$A$ is closed if and only if $A = \overline{A}$.

Problem 929. $p \in \overline{A}$ if and only if every open interval containing $p$ has nonempty intersection with $A$. Hence $p \notin \overline{A}$ if and only if there is an open interval $I$ containing $p$ such that $I \cap A = \emptyset$.

Problem 930. Prove that $\overline{\overline{A}} = \overline{A}$ and $\overline{A \cup B} = \overline{A} \cup \overline{B}$.

Problem 931. Let $A$ be a closed set. If $B$ is a nonempty bounded subset of $A$ then $\inf B \in A$ and $\sup B \in A$. In particular, if $A$ is nonempty, closed, and bounded, then $\inf A \in A$ and $\sup A \in A$.


Proposition 932. A set is closed if and only if its complement is open. 


Proof. Let $A$ and $B$ be complements of each other so that $A \cap B = \emptyset$ and $A \cup B = \mathbb{R}$. We show that $A$ is closed if and only if $B$ is open.

If $A$ is closed so that $A = \overline{A}$, then for any $x \in B$ we have $x \notin \overline{A}$, hence there is an open interval $I$ such that $x \in I$ and $I \cap A = \emptyset$, which means $x \in I \subseteq B$. Thus $B$ is open.

If $B$ is open then for any $x \in B$ there is an open interval $I$ with $x \in I \subseteq B$, hence $x \in I$ and $I \cap A = \emptyset$, and so $x \notin \overline{A}$. Thus no point of $B$ is in $\overline{A}$, which means $\overline{A} \subseteq A$, and so $A$ is closed. □

Corollary 933. 1. The empty set $\emptyset$ and $\mathbb{R}$ are closed.
2. The intersection of any collection of closed sets is closed.
3. The union of finitely many closed sets is closed.

Problem 934. If $A$ is dense-in-itself, $G$ is open, and $A \cap G \ne \emptyset$, then $A \cap G$ is a nonempty dense-in-itself set.

Proposition 935. The Cantor set is perfect. 


Proof. Let $K$ be the Cantor set. Then $K = \bigcap_n K_n$, where $K_n$ is the union of the $2^n$ closed intervals obtained at stage $n$ of the construction of the Cantor set. Now each $K_n$ is closed, being a finite union of closed intervals. Hence $K$ is closed, being the intersection of the sequence of closed sets $K_n$.

To see that $K$ is dense-in-itself: Given any $x \in K$ and any open interval $(a, b)$ with $a < x < b$, pick $n$ large enough so that $1/3^n < \min(x - a, b - x)$. Since $x \in K_n$, $x$ is in one of the $2^n$ closed intervals of length $1/3^n$ making up $K_n$, say in $[c, d]$. Then $[c, d] \subseteq (a, b)$ since $d - c = 1/3^n$. Now both $c$ and $d$ are in $K$, but either $c \ne x$ or $d \ne x$, and so $(a, b)$ contains a point of $K$ other than $x$. Thus $x$ is a limit point of $K$. Hence $K$ is dense-in-itself.

Thus $K$ is perfect. □


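Example 936. The complement of $K$ is open, and therefore can be decomposed into a unique family of disjoint open intervals. Note that this decomposition of $\mathbb{R} \setminus K$ consists of the two unbounded open intervals $(-\infty, 0)$ and $(1, \infty)$ as well as infinitely many bounded open intervals

$$\left(\tfrac13, \tfrac23\right),\ \left(\tfrac19, \tfrac29\right),\ \left(\tfrac79, \tfrac89\right),\ \left(\tfrac1{27}, \tfrac2{27}\right),\ \left(\tfrac7{27}, \tfrac8{27}\right),\ \left(\tfrac{19}{27}, \tfrac{20}{27}\right),\ \left(\tfrac{25}{27}, \tfrac{26}{27}\right),\ \dots$$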
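
The bounded intervals in this decomposition are the open middle thirds removed stage by stage. The following Python sketch generates them in the order of construction; the function name `removed_intervals` is our own, and exact fractions are used for the endpoints.

```python
from fractions import Fraction

def removed_intervals(stages):
    """Open middle thirds removed in the first `stages` steps of the
    Cantor construction, as pairs (lo, hi), in order of removal."""
    kept = [(Fraction(0), Fraction(1))]   # closed intervals still present
    removed = []
    for _ in range(stages):
        next_kept = []
        for lo, hi in kept:
            third = (hi - lo) / 3
            removed.append((lo + third, hi - third))          # open middle third
            next_kept += [(lo, lo + third), (hi - third, hi)]  # two closed thirds
        kept = next_kept
    return removed

for lo, hi in removed_intervals(3):
    print(f"({lo}, {hi})")
```

Stage $n$ contributes $2^{n-1}$ intervals of length $3^{-n}$, so the first $n$ stages remove $2^n - 1$ intervals in total.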


Problem 937. Show that there are exactly $\mathfrak{c}$ many closed and perfect sets. How many dense-in-itself sets are there?

Problem 938. If $A$ and $B$ are closed sets with $A \cap B = \emptyset$ then there exist open sets $U$ and $V$ with $A \subseteq U$, $B \subseteq V$, and $U \cap V = \emptyset$ (any two disjoint closed sets can be separated by disjoint open sets).

[Hint: Let $U$ be the union of all open intervals $I = (a, b)$ such that $A \cap I \ne \emptyset$ but $B \cap (a - \operatorname{len}(I),\ b + \operatorname{len}(I)) = \emptyset$. Similarly define $V$.]


Definition 939 (Eventual Containment). A sequence $(x_n)_{n \in \mathbb{N}}$ is eventually in a set $A$ if there is $m \in \mathbb{N}$ such that $x_n \in A$ for all $n > m$.

Definition 940 (Convergence and Limit). A sequence $(x_n)_{n \in \mathbb{N}}$ of real numbers converges to a real number $x$, written as $(x_n) \to x$ or as $x_n \to x$ as $n \to \infty$, if for any open interval $I$ containing $x$, the sequence is eventually in $I$. If $(x_n) \to x$, we also say that $x$ is a limit of the sequence $(x_n)_{n \in \mathbb{N}}$.

Problem 941 (Uniqueness of Limits of Sequences). If $(x_n) \to x$ and $(x_n) \to x'$ then $x = x'$. (A limit of a sequence, if it exists, is unique.)

Thus the limit of a convergent sequence is also written as:

$$\lim_{n \to \infty} x_n \quad \text{or} \quad \lim_n x_n.$$

Definition 942 (Cauchy Sequences). A sequence $(x_n)_{n \in \mathbb{N}}$ of real numbers is a Cauchy Sequence if for any $\epsilon > 0$ there is $k \in \mathbb{N}$ such that $|x_m - x_n| < \epsilon$ for all $m, n > k$.

Proposition 943 (The Cauchy Criterion for Convergence). A sequence $(x_n)_{n \in \mathbb{N}}$ is convergent if and only if it is Cauchy.


Proof. If $(x_n) \to x$, then, given any $\epsilon > 0$ we can fix $k$ such that $|x_n - x| < \frac{\epsilon}{2}$ for all $n > k$, so $|x_m - x_n| \le |x_m - x| + |x - x_n| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$ for all $m, n > k$.

Conversely, suppose that $(x_n)_{n \in \mathbb{N}}$ is a Cauchy sequence. Then we can fix $k$ such that $|x_m - x_n| < 1$ for all $m, n \ge k$, and so $x_n \in [x_k - 1, x_k + 1]$ for all $n \ge k$. Thus $(x_n)_{n \in \mathbb{N}}$ is eventually in the interval $I_1 := [x_k - 1, x_k + 1]$. Now note that if a Cauchy sequence is eventually in an interval $[a, b]$ of length $\ell = b - a$, then it is either eventually in $[a, b - \frac{\ell}{3}]$ or in $[a + \frac{\ell}{3}, b]$. Hence, starting with $I_1$, we can recursively define a nested sequence of closed intervals $I_1 \supseteq I_2 \supseteq \cdots$ such that $\operatorname{len}(I_{m+1}) = \frac23 \operatorname{len}(I_m)$ and $(x_n)_{n \in \mathbb{N}}$ is eventually in $I_m$ for all $m$. Let $x$ be in $\bigcap_m I_m$. Since $\operatorname{len}(I_m) \to 0$, any open interval $J$ containing $x$ must also contain some $I_m$, and so $(x_n)_{n \in \mathbb{N}}$ is eventually in $J$. Thus $(x_n) \to x$. □


Problem 944 (CAC). $x \in \overline{A}$ if and only if there is a sequence $(x_n)_{n \in \mathbb{N}}$ converging to $x$ with $x_n \in A$ for all $n \in \mathbb{N}$.


14.4 Dense, Discrete, and Nowhere Dense Sets


Definition 945. 1. A set $A \subseteq \mathbb{R}$ is called everywhere dense or simply dense (in $\mathbb{R}$) if every point of $\mathbb{R} \setminus A$ is a limit point of $A$, that is, if $\overline{A} = \mathbb{R}$.

2. More generally, we say that $A$ is dense on $B$ if every point of $B \setminus A$ is a limit point of $A$, that is, if $B \subseteq \overline{A}$. If in addition $A \subseteq B$, we say that $A$ is a dense subset of $B$ or that the subset $A$ is dense in $B$.


For example, Q is (everywhere) dense in R. 


Problem 946. Let E be the set of end points of the open intervals removed in the 
construction of the Cantor set. Show that E is a dense subset of the Cantor set. 


Proposition 947 (CAC). Every set has a countable dense subset. 


Proof. Let $E \subseteq \mathbb{R}$ be nonempty, and consider the collection

$$\mathcal{C} := \{E \cap (p, q) \mid p, q \in \mathbb{Q},\ E \cap (p, q) \ne \emptyset\}$$

of nonempty sets which are intersections of $E$ with open intervals with rational endpoints. Then $\mathcal{C}$ is a countable collection of nonempty subsets of $E$, and so by the Countable Axiom of Choice there is a function $\varphi\colon \mathcal{C} \to \bigcup \mathcal{C}$ such that $\varphi(A) \in A$ for all $A \in \mathcal{C}$. Put:

$$D := \{\varphi(A) \mid A \in \mathcal{C}\}.$$

Then $D$ is a countable subset of $E$. We claim that $D$ is dense in $E$, that is, $E \subseteq \overline{D}$. It suffices to show that for every $x \in E$ and any open interval $I$ with $x \in I$ we have $I \cap D \ne \emptyset$.

Let $x \in E$, and let $I = (a, b)$ be an open interval with $x \in I$. We can fix rational numbers $p$ and $q$ such that $a < p < x < q < b$, so that $E \cap (p, q) \ne \emptyset$ and so $E \cap (p, q) \in \mathcal{C}$. Put $y = \varphi(E \cap (p, q))$. Then $y \in E \cap (p, q) \subseteq I$ and $y \in D$, so $I \cap D \ne \emptyset$. □


Problem 948. A set A is everywhere dense if and only if every nonempty open 
interval (or nonempty open set) has nonempty intersection with A. 

A set A is dense on a set B if and only if every nonempty open interval which has
nonempty intersection with B also has nonempty intersection with A.


Problem 949. If $A$ is dense on $B$ and $B$ is dense on $C$ then $A$ is dense on $C$.

Problem 950. Find two disjoint sets each of which is dense.

Problem 951. A dense subset of a dense-in-itself set is dense-in-itself.

Proposition 952. In any set, all but countably many points are limit points. That is, the set $A \setminus D(A)$ of isolated points of a set $A$ is countable.

Proof. For each $x \in A \setminus D(A)$, since $x$ is an isolated point of $A$, we can choose an open interval $(a, b)$ with $(a, b) \cap A = \{x\}$, and then fix rational numbers $p_x, q_x$ such that $a < p_x < x < q_x < b$. For each $x \in A \setminus D(A)$ we have

$$(p_x, q_x) \cap A = \{x\},$$

and so if $x \ne y$ are in $A \setminus D(A)$ then $(p_x, q_x) \ne (p_y, q_y)$. Hence the mapping

$$x \mapsto (p_x, q_x)$$

is a one-to-one mapping from $A \setminus D(A)$ into the countable family of intervals with rational endpoints, and so $A \setminus D(A)$ must be countable. □


Definition 953. A set $A \subseteq \mathbb{R}$ is called discrete if each point of $A$ is an isolated point of $A$, that is, if $A \cap D(A) = \emptyset$.

By the previous proposition, every discrete set is countable. Some examples of discrete subsets of $\mathbb{R}$ are $\mathbb{N}$, $\mathbb{Z}$, and the set $\{\frac1n \mid n \in \mathbb{N}\}$.


Problem 954. Show that the union of two closed discrete sets is discrete. 


Problem 955. Show that the set $A := \mathbb{N} \cup \{n\sqrt{2} \mid n \in \mathbb{N}\}$ is discrete, but that $A$ has arbitrarily close points, that is, for any $\epsilon > 0$ there exist $p, q \in A$ with $0 < |p - q| < \epsilon$.


Definition 956 (Nowhere Dense Sets). If a set is dense on some nonempty open 
interval, we call it somewhere dense; otherwise, we call it nowhere dense. 


Clearly any subset of a nowhere dense set is nowhere dense.


Problem 957. A set is nowhere dense if and only if its closure does not contain any 
nonempty open interval. Hence a closed set is nowhere dense if and only if it does 
not contain any nonempty open interval. 


Since the Cantor set is closed and does not contain any nonempty open interval, we 
have: 


Proposition 958. The Cantor set is nowhere dense. 


Problem 959. A set A is nowhere dense if and only if every nonempty open interval 
contains a nonempty open subinterval which is disjoint from A. 


Problem 960. Prove that a set is nowhere dense if its complement contains a dense 
open set. 


Problem 961. The intersection of two dense open sets is a dense open set. 
Problem 962. The union of two nowhere dense sets is nowhere dense. 


Problem 963. Consider the collection $\mathcal{C}$ of the open intervals removed in the construction of the Cantor set. Since the open intervals in $\mathcal{C}$ are nonempty and pairwise disjoint, $\mathcal{C}$ can be naturally ordered by the usual order on $\mathbb{R}$: For $I, J \in \mathcal{C}$ we have $I < J$ if and only if $x < y$ for all $x \in I$ and all $y \in J$. Show that under this natural ordering, $\mathcal{C}$ becomes a dense order of type $\eta$.


Problem 964. Give an example of a countable set disjoint from the Cantor set 
which is dense on the Cantor set. 

Problem 965. Give an example of an infinite discrete set A ⊆ R such that the
suborder A is a dense order, that is, for any x < y in A there is z in A with
x < z < y.


Problem 966. Give an example of a discrete subset A ⊆ R such that D(A) is an
uncountable perfect set.


Problem 967. Show that the closure of a discrete set is nowhere dense, and so if A 
is discrete then D(A) is nowhere dense closed. 


Problem 968*. Show that if E is nowhere dense closed then E = D(A) for some
discrete set A.


Condensation Points 


Definition 969 (Condensation Points). A point p is called a condensation point of
a set A if every open interval containing p contains uncountably many points of A.


Clearly, every condensation point of a set is a limit point of the set. 
The following is an important generalization of Proposition 952. 


Theorem 970. All but countably many points of a set are condensation points. 


Proof. Let C be the set of condensation points of a set A. We show that A ∖ C is
countable. Put

    H := {A ∩ (p, q) | p, q ∈ Q, |A ∩ (p, q)| ≤ ℵ₀}.

Then H is a countable collection of countable subsets of A, and so E := ⋃H is a
countable subset of A. We claim that A ∖ C ⊆ E: Let x ∈ A ∖ C, so that x is not a
condensation point of A, and hence there is an open interval (a, b) with x ∈ (a, b)
and |A ∩ (a, b)| ≤ ℵ₀. Fix p, q ∈ Q such that a < p < x < q < b, so that
A ∩ (p, q) ∈ H, therefore A ∩ (p, q) ⊆ E, and so x ∈ E. Hence A ∖ C ⊆ E, and
so A ∖ C is countable. □


Clearly, if x is a condensation point of A, and B is any countable subset of A not
containing x, then A ∖ B will still have x as a condensation point. Hence by the
theorem, if C is the set of condensation points of an uncountable set A, then C is an
uncountable set in which every point is a condensation point. Therefore we have:


Corollary 971. The set of condensation points of any uncountable set forms a 
nonempty subset which is dense-in-itself. 


Using a routine argument, we have the following result. 
Problem 972. The set of condensation points of a closed set is closed. 


The important Cantor–Bendixson Theorem is now an immediate corollary.

Corollary 973 (The Cantor–Bendixson Theorem). Any uncountable closed set is
the union of a nonempty perfect set and a countable set.


The Cantor–Bendixson Theorem will be proved again more effectively in Chap. 16
(Theorem 1079 and Corollary 1080 in Sect. 16.2).


We had seen in Theorem 598 that every nonempty perfect subset of a complete
order has cardinality 𝔠. Hence it follows from the Cantor–Bendixson Theorem that
“the closed sets satisfy the Continuum Hypothesis” in the following sense.


Corollary 974. Every closed set is either countable or has cardinality 𝔠.


A different proof of the last result (a proof without using Theorem 598) will be given 
in Chap. 15 (Corollary 1051). 


Problem 975. Show that all but countably many points of a set are two-sided limit 
points of the set. 


Problem 976. Find an example of a nonempty dense-in-itself set in which no point 
is an upper limit point of the set. 


Generalized Cantor Sets Are Perfect Nowhere Dense 


Theorem 977. Every generalized Cantor set is a bounded perfect nowhere dense
set which is bijective with {0, 1}^N (and so has cardinality 𝔠 = 2^ℵ₀).


Proof. Let A be a generalized Cantor set generated by the Cantor system
⟨J_u | u ∈ {0, 1}*⟩. We had already seen that A is bijective with {0, 1}^N and hence
has cardinality 𝔠 = 2^ℵ₀.

First note that A is bounded, as A ⊆ J_ε = [a, b] for some a < b (where ε is the
empty string). Also, A is closed, as (by Problem 905) it is an intersection of a
collection of closed sets.
To see that A is dense-in-itself, fix p ∈ A and an infinite binary string b_1 b_2 ⋯ b_n ⋯
such that p ∈ ⋂_n J_{b_1 b_2 ⋯ b_n}. Given an open interval (c, d) containing p, choose a
sufficiently large n for which len(J_{b_1 b_2 ⋯ b_n}) < min(p − c, d − p), so that
J_{b_1 b_2 ⋯ b_n} ⊆ (c, d). Let u be the finite binary string b_1 b_2 ⋯ b_n. Then at most one
of the disjoint intervals J_{u⌢0} and J_{u⌢1} can contain p, so by taking v to be either
u⌢0 or u⌢1, we can assume that p ∉ J_v. Now fix any infinite binary string extending v
and let q be the unique point in the intersection of the corresponding nested sequence
of intervals. Then q ∈ J_v and so p ≠ q while q ∈ (c, d) ∩ A. Thus every open interval
containing p contains a point of A distinct from p, which means p ∈ D(A). Since
p was chosen arbitrarily from A, it follows that every point of A is a limit point, so
A is dense-in-itself. Hence A is perfect.

To show that A is nowhere dense, it will suffice to show that A does not contain
any proper interval. Let (c, d) be a nonempty open interval. We show that A does
not contain all points of (c, d). Fix p ∈ (c, d). If p ∉ A we are done, so assume that
p ∈ A and let b_1 b_2 ⋯ b_n ⋯ be an infinite binary string such that p ∈ ⋂_n J_{b_1 b_2 ⋯ b_n}.
Since lim_n len(J_{b_1 b_2 ⋯ b_n}) = 0, we can choose a sufficiently large n for which
len(J_{b_1 b_2 ⋯ b_n}) < min(p − c, d − p), so that J_{b_1 b_2 ⋯ b_n} ⊆ (c, d). Let u be the finite
binary string b_1 b_2 ⋯ b_n. Since J_{u⌢0} and J_{u⌢1} are disjoint closed subintervals of
J_u, there must exist some q ∈ J_u which does not belong to either of the intervals
J_{u⌢0} or J_{u⌢1} (since by Theorem 599 an interval cannot be the disjoint union of two
nonempty closed intervals). We now claim that q ∉ A. To see this, assume that
q ∈ A (to get a contradiction), and fix a finite binary string v with len(v) > len(u)
such that q ∈ J_v. Then q ∈ J_u ∩ J_v, so J_u ∩ J_v ≠ ∅, and so v must be an extension
of u. But since len(v) > len(u), v must either be an extension of u⌢0 or be an
extension of u⌢1, which implies that q is either in J_{u⌢0} or in J_{u⌢1}, contradicting
the fact that q is not in either of these two intervals. Since q ∈ J_u ⊆ (c, d), the
interval (c, d) contains a point which is not in A. □


We will later prove (in the chapter on Brouwer’s theorem) that a converse of the
result also holds: Every bounded perfect nowhere dense set is a generalized Cantor
set, that is, it can be generated by some Cantor system.


14.5 Continuous Functions and Homeomorphisms 


We had already defined continuous functions for orders, but we now want to define 
continuity for functions which are only partially defined on R, i.e., for functions 
whose domain may be a proper subset of R. It is assumed that the reader is familiar 
with this notion of continuity through elementary calculus. 


Definition 978 (Continuous Functions). Let A ⊆ R and let f: A → R.


1. We say that f is continuous at a point p ∈ A if for any open interval J containing
f(p) there is an open interval I containing p such that for all x, x ∈ I ∩ A ⇒ f(x) ∈ J.

2. We say that f is continuous (on A) if for all p ∈ A, f is continuous at p.

Problem 979. Enumerate the rational numbers as Q = {r_n | n ∈ N} where r_m ≠ r_n
for m ≠ n, and define f: R → R by:

\[ f(x) = \begin{cases} 1/n & \text{if } x = r_n \text{ and } n \in \mathbf{N}, \\ 0 & \text{otherwise.} \end{cases} \]

Show that f is continuous at each irrational point, but is discontinuous (not
continuous) at each rational point.


Problem 980. Show that if A is a discrete set then any function defined on A is 
continuous. 


Problem 981. Let f: R → R. Show that each of the following conditions is
necessary and sufficient for f to be continuous on R.

1. For any x ∈ R and any open interval J containing f(x) there is an open interval
I containing x such that f[I] ⊆ J.

2. For any open interval J the inverse image f⁻¹[J] is an open set.

3. For any open set G the inverse image f⁻¹[G] is an open set.

4. For any closed set F the inverse image f⁻¹[F] is a closed set.


Problem 982. Show that f: A → R is continuous if and only if for any x ∈ A and
any sequence ⟨x_n⟩_{n∈N} with x_n ∈ A for all n, if ⟨x_n⟩ → x then ⟨f(x_n)⟩ → f(x).


Problem 983. Let f: R → R be continuous. Show that for any sequence of nested
closed intervals I_1 ⊇ I_2 ⊇ ⋯ ⊇ I_n ⊇ I_{n+1} ⊇ ⋯, if len(I_n) → 0 as n → ∞ then

\[ f\Bigl[\bigcap_n I_n\Bigr] = \bigcap_n f[I_n]. \]


We had earlier proved the Intermediate Value Theorem for general linear continuums
(Corollary 600). Since an interval in R is a linear continuum, the IVT remains
true for this case.


Theorem 984 (IVT). Let I be an interval and f: I → R be continuous. Then
f[I] is an interval. In other words, if a < b are in I and if f(a) < y < f(b) or if
f(a) > y > f(b) then there is x ∈ (a, b) such that f(x) = y.


Let f: R → R. For a, b ∈ R, let f_a^b denote the function obtained from f by
redefining its value at a to b, i.e., f_a^b(x) := b if x = a and f_a^b(x) := f(x)
otherwise. We say that f has a removable discontinuity at the point a if f is
discontinuous at a but f_a^b is continuous at a for some b ∈ R. (In terms of limits,
this means that lim_{x→a} f(x) exists but does not equal f(a).)


Problem 985. Let f: R → R and let E be the set of points at which f has a
removable discontinuity. Then E is countable.


[Hint: To each a ∈ E, assign rational numbers p, q, r, s such that a ∈ (p, q) and for
all x ∈ (p, q) with x ≠ a we have f(x) ∈ (r, s) but f(a) ∉ (r, s).]


Homeomorphisms and Homeomorphic Sets 


Definition 986. Let A, B ⊆ R. A homeomorphism from A to B is a bijection f
from A onto B such that both f and f⁻¹ are continuous. The sets A and B are
homeomorphic if there is a homeomorphism from A onto B.


For example, the mapping x ↦ x/(1 + |x|) is a homeomorphism from R onto the
open interval (−1, 1).
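
As a quick numerical sanity check of this example, here is a minimal Python sketch. The inverse formula y ↦ y/(1 − |y|) is not stated in the text above; it is obtained by solving y = x/(1 + |x|) for x, and the function names are illustrative only. Floating-point checks of this kind suggest, but of course do not prove, that the map is an order-preserving bijection onto (−1, 1).

```python
def squash(x: float) -> float:
    """The map x -> x / (1 + |x|) from R into (-1, 1)."""
    return x / (1.0 + abs(x))


def unsquash(y: float) -> float:
    """Candidate inverse y -> y / (1 - |y|), defined for -1 < y < 1."""
    if not -1.0 < y < 1.0:
        raise ValueError("y must lie in the open interval (-1, 1)")
    return y / (1.0 - abs(y))


samples = [-1000.0, -3.5, -1.0, 0.0, 0.25, 2.0, 1000.0]
images = [squash(x) for x in samples]

# Every image lies strictly between -1 and 1, the order of the samples
# is preserved, and unsquash undoes squash up to rounding error.
in_range = all(-1.0 < y < 1.0 for y in images)
increasing = all(u < v for u, v in zip(images, images[1:]))
round_trip = all(abs(unsquash(squash(x)) - x) <= 1e-9 * max(1.0, abs(x))
                 for x in samples)
```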


Problem 987. Let f: R → R be continuous and strictly increasing. Show that
f[R] is an open interval (which may be unbounded) and that f is a homeomorphism
from R onto f[R].


Problem 988. Any two infinite discrete sets are homeomorphic. 

Problem 989. The set {0} ∪ {1/n | n ∈ N} is homeomorphic to the set {0} ∪ {1/n |
n ∈ N} ∪ {−1/n | n ∈ N}, but not to {0} ∪ {1/n | n ∈ N} ∪ N.


Problem 990. No two of [0, 1], [0, 1), and (0, 1) are homeomorphic.

Problem 991*. The sets [0, 1] ∩ Q and (0, 1) ∩ Q are homeomorphic.


Problem 992. Let I and J be nonempty open intervals (none, one, or both of which
may be unbounded), A be a countable dense subset of I, and B be a countable dense
subset of J. Show that there is a homeomorphism f of I onto J such that f[A] = B,
and conclude that A and B must be homeomorphic.


[Hint: Use Cantor’s theorem on countable dense orders.]

If A and B are homeomorphic via f, then f will preserve all internal properties
of points and subsets of A involving limit points. For example, if A has exactly three
limit points, then the same will be true for B. If p ∈ A and E ⊆ A, then p is a
limit point of E if and only if f(p) is a limit point of f[E], E is a dense subset of
A if and only if f[E] is a dense subset of B, and so on.


Problem 993. Let A be homeomorphic to B. Show that 


1. A is discrete if and only if B is discrete. 
2. A contains a proper interval if and only if B contains a proper interval. 
3. A is dense-in-itself if and only if B is dense-in-itself. 


Internal properties of sets which are preserved by homeomorphisms are called 
topological properties. Thus the properties listed in the last problem are topological. 
If two sets are homeomorphic then they will share all topological properties and are 
said to have identical (internal) topological structures. 

Properties of a subset A of R which express how A, its points or other subsets of 
A relate to the parent R (i.e., properties which express how A is situated within R) 
are in general not topological. This is illustrated by the following problem. 


Problem 994. None of the following properties of subsets of R is a topological 
property, i.e., it is not preserved by homeomorphisms in general. 


1. Being bounded.

2. Being closed.

3. Being everywhere dense.

4.* Being nowhere dense.


Problem 995. Neither of the properties of being closed and being nowhere dense is
a topological property, but if A and B are homeomorphic closed sets then A is
nowhere dense if and only if B is nowhere dense.


While neither of the individual properties of being closed and being bounded is a
topological property, the combined property of being both closed and bounded
is a topological property and is known as compactness. The proof of this
fact requires the Heine–Borel theorem, which is covered in the next chapter.

Space Filling Peano Curves


By a (parametric) continuous curve (x(t), y(t)), 0 ≤ t ≤ 1, in the plane we mean
a pair of continuous functions t ↦ x(t) and t ↦ y(t), 0 ≤ t ≤ 1. The classic
example is that of the unit circle x(t) = cos 2πt, y(t) = sin 2πt. Peano showed
the surprising result that there is a continuous curve in the plane which fills up
the entire unit square [0, 1] × [0, 1]! We can readily derive Peano’s result using the
“identification” of the Cantor set K with the set {0, 1}^N of infinite binary sequences
(review Sects. 6.6 and 6.7).
For each α: N → N, define h_α: K → [0, 1] by:

\[ h_\alpha(x) = \sum_{n=1}^{\infty} \frac{x_{\alpha(n)}}{2^n} \qquad \Bigl( x = \sum_{n=1}^{\infty} \frac{2x_n}{3^n} \in K,\ \langle x_n \rangle \in \{0,1\}^{\mathbf{N}} \Bigr). \]

Problem 996. If α: N → N is injective then h_α is a continuous surjection from K
onto [0, 1]. If α, β: N → N are injective with disjoint ranges, then for all (a, b) ∈
[0, 1] × [0, 1] there is x ∈ K with a = h_α(x), b = h_β(x).


Let h := h_ι, where ι: N → N is the identity map ι(n) = n.


Problem 997. For the function h = h_ι, if (a, b) is a component open interval of the
complement of the Cantor set, then h(a) = h(b).


[Figure: The function h]


By the last problem, one can extend h to a function h̄: [0, 1] → [0, 1] by “joining
h(a) and h(b) with a horizontal line segment” over each component open interval
of [0, 1] ∖ K. The resulting function h̄ will be a continuous function (which maps K
onto [0, 1]). More generally, we have:


Problem 998 (Cantor Ternary Functions). Let g: K → [a, b] be continuous.
Then there is a unique continuous extension ḡ: [0, 1] → [a, b] of g such that ḡ
is linear on each component open interval of [0, 1] ∖ K.

Now fix injective α, β: N → N with disjoint ranges, e.g., α(n) = 2n − 1 and
β(n) = 2n, and put λ := h̄_α, μ := h̄_β, where the bar denotes the continuous
extension to [0, 1] given by Problem 998. Then (λ(t), μ(t)), 0 ≤ t ≤ 1, is
a continuous curve in the plane, and by Problem 996 it fills up the unit square. We
thus have:


Corollary 999 (Peano Curve). Let α(n) := 2n − 1, β(n) := 2n, λ := h̄_α, and
μ := h̄_β. Then (λ(t), μ(t)), 0 ≤ t ≤ 1, is a continuous curve in the plane which
fills up the entire unit square.
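
The surjectivity argument behind Problem 996 and Corollary 999 can be made concrete for points with finite binary expansions. The following Python sketch is illustrative only: it assumes that h_α(x) is the sum of the terms x_{α(n)}/2^n, where ⟨x_n⟩ is the binary sequence attached to the Cantor set point x = Σ 2x_n/3^n, it truncates all sequences to finitely many digits, and the helper names are hypothetical. Interleaving the binary digits of a and b yields a parameter value in K that the curve sends to (a, b).

```python
from fractions import Fraction


def cantor_point(bits):
    """Point of the Cantor set coded by a finite 0/1 sequence: sum of 2*x_n/3^n."""
    return sum(Fraction(2 * b, 3 ** n) for n, b in enumerate(bits, start=1))


def h(alpha, bits, terms):
    """Partial sum of h_alpha: sum over n <= terms of x_{alpha(n)} / 2^n.

    Digits beyond the end of `bits` are treated as 0.
    """
    def digit(i):  # x_i with 1-based index i
        return bits[i - 1] if i <= len(bits) else 0
    return sum(Fraction(digit(alpha(n)), 2 ** n) for n in range(1, terms + 1))


def interleave(a_bits, b_bits):
    """Sequence x with x_{2n-1} = a_n and x_{2n} = b_n."""
    out = []
    for a, b in zip(a_bits, b_bits):
        out.extend([a, b])
    return out


def alpha(n):
    return 2 * n - 1          # odd positions


def beta(n):
    return 2 * n              # even positions


a_bits = [1, 0, 1]            # a = 1/2 + 1/8 = 5/8
b_bits = [0, 1, 1]            # b = 1/4 + 1/8 = 3/8
x_bits = interleave(a_bits, b_bits)

t = cantor_point(x_bits)              # parameter value in K
lam = h(alpha, x_bits, terms=3)       # first coordinate of the curve at t
mu = h(beta, x_bits, terms=3)         # second coordinate of the curve at t
```

Here the point t = 512/729 of the Cantor set is mapped to (5/8, 3/8); the same interleaving works digit by digit for arbitrary infinite binary expansions.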


Chapter 15 
The Heine-Borel and Baire Category Theorems 


Abstract This chapter starts with the Heine–Borel theorem and its characterization
of complete orders, and then uses Borel’s theorem to give a measure-theoretic
proof that R is uncountable. Other topics focus on measure and category: Lebesgue
measurable sets, Baire category, the perfect set property for G_δ sets, the Banach–Mazur
game and Baire property, and the Vitali and Bernstein constructions.


15.1 The Heine–Borel Theorem


Earlier, we had encountered the Bolzano–Weierstrass and Nested Intervals properties
for complete orders and saw that neither of those properties characterizes
complete orders. On the other hand, the Heine–Borel theorem, which has very
wide applicability, gives a stronger condition which actually characterizes complete
orders.


Definition 1000. Let A be a set and C be a collection of sets. We say that A is
covered by C or C covers A if every element of A is a member of some set in C,
that is if

\[ A \subseteq \bigcup C. \]

We also say that A is covered by the sets A_1, A_2, ..., A_n if A is covered by the
collection {A_1, A_2, ..., A_n}.


Theorem 1001. Let [a, b] be a bounded closed interval and let C be a collection
of open sets which covers [a, b]. Then [a, b] can be covered by finitely many sets
from C.


Proof. Let a < b and let C be a collection of open sets with

\[ [a,b] \subseteq \bigcup C. \]

Let A be the set consisting of those real numbers x ∈ [a, b] such that the interval
[a, x] can be covered by finitely many sets from C. In other words, we have x ∈ A
if and only if a ≤ x ≤ b and there are sets G_1, G_2, ..., G_n ∈ C (for some natural
number n) such that [a, x] ⊆ ⋃_{k=1}^{n} G_k. Then A ⊆ [a, b] is bounded and nonempty
(since a ∈ A), and so with

\[ c := \sup A \]

we have c ∈ [a, b]. Since c ∈ [a, b] there is an open set G ∈ C such that c ∈ G, and
so we can fix an open interval (p, q) such that c ∈ (p, q) ⊆ G.

Since p < c and c = sup A, there is r ∈ A such that p < r ≤ c. Then r ∈ [a, b]
and [a, r] can be covered by finitely many sets from C, say

\[ [a,r] \subseteq \bigcup_{k=1}^{n} G_k, \qquad G_k \in C \text{ for } k = 1, 2, \dots, n \ (n \in \mathbf{N}). \]

Putting G_{n+1} := G, we see that [r, c] ⊆ G_{n+1}, and so

\[ [a,c] = [a,r] \cup [r,c] \subseteq \bigcup_{k=1}^{n+1} G_k, \]

hence [a, c] can be covered by finitely many sets from C and so c ∈ A. Moreover,
if we had c < b, we could choose s with c < s < min(b, q) and so we would get
[r, s] ⊆ G = G_{n+1}. Then [a, s] = [a, r] ∪ [r, s] would again be covered by the sets
G_1, G_2, ..., G_n, G_{n+1}, which would imply s ∈ A, which is a contradiction since
s > c = sup A. Therefore c = b, hence b ∈ A, and so [a, b] can be covered by
finitely many sets from C. □


The following generalization of Theorem 1001 is usually called the Heine–Borel
theorem. It is sometimes paraphrased as “every open cover of a bounded closed set
has a finite subcover.”


Corollary 1002 (Heine–Borel). Let E be a bounded closed set and let C be a
collection of open sets which covers E. Then E can be covered by finitely many
sets from C.


Proof. Since E is bounded there exist a, b with E ⊆ [a, b]. Let G = R ∖ E, so that
G is open, and put C′ = C ∪ {G}. Then C′ covers R, since for any x either x ∈ E
in which case x ∈ ⋃C or else x ∈ G. Hence C′ covers [a, b], so [a, b], and hence
E, can be covered by finitely many sets from C′, say E ⊆ ⋃_{k=1}^{n} G_k with G_k ∈ C′
for k = 1, 2, ..., n. Now from the sets G_k, k = 1, 2, ..., n, we can remove any
G_k which equals G, and the remaining sets will still cover E (since E ∩ G = ∅),
giving us finitely many sets from C which cover E. □

Corollary 1003. If F_1, F_2, ..., F_n, ... are bounded closed sets, G is an open set,
and ⋂_{n=1}^{∞} F_n ⊆ G, then ⋂_{n=1}^{m} F_n ⊆ G for some m ∈ N.


Proof. Put G_n := R ∖ F_n. Then ⋂_n F_n ⊆ G means that F_1 ⊆ G ∪ (⋃_{n=1}^{∞} G_n), and
so by the Heine–Borel theorem there is m such that F_1 ⊆ G ∪ (⋃_{k=1}^{m} G_k), that is,
⋂_{k=1}^{m} F_k ⊆ G. □


Corollary 1004. The intersection of a nested sequence of nonempty bounded 
closed sets is nonempty. 


Problem 1005. Let A be a subset of R. Show that if every covering of A by a 
collection of open sets has a finite subcollection which also covers A, then A is 
closed and bounded. 


The following is also known as the Heine–Borel Theorem (in a “necessary and
sufficient” form).


Corollary 1006 (The Heine–Borel Theorem). A subset A of R is closed and
bounded if and only if every covering of A by a collection of open sets has a finite
subcollection which also covers A.


This final version of the Heine–Borel theorem is a characterization of the
property of being a closed and bounded subset of R using a property involving
covering by open sets. It can be used to show that the property of being a closed and
bounded subset of R is preserved by homeomorphisms, that is, it is a topological
property. Thus even though the individual properties of being closed or being
bounded are not topological, the combined property is a topological property,
and is termed compactness.


Problem 1007. Show that if A and B are homeomorphic subsets of R and one of 
them is closed and bounded then so must be the other. In other words, compactness 
is a topological property. 


It should be noted that the proof of Theorem 1001 uses only the completeness
property of R, and hence can be generalized for orders. We will say that an order X
satisfies the Heine–Borel condition if for any bounded closed interval I covered by
a collection of open intervals there are finitely many of those open intervals which
cover I.


Problem 1008. Let X be an order without endpoints. Show that X is complete if
and only if X satisfies the Heine–Borel condition.


Definition 1009. The length of an interval I is defined as the nonnegative quantity
len(I) := sup I − inf I, and the total length of a sequence ⟨I_n⟩ of intervals is defined
as the nonnegative sum Σ_n len(I_n).


We will now apply the Heine–Borel theorem to obtain the following result known as
Borel’s Theorem, which is useful in the theory of Lebesgue measure: An interval [a, b]
cannot be covered by a sequence of open intervals having total length ≤ b − a.


First, we need the following proposition. 

Proposition 1010. Let [a, b] be a closed interval and suppose that (a_k, b_k) is an
open interval with a_k < b_k for each k = 1, 2, ..., n. If

\[ [a,b] \subseteq \bigcup_{k=1}^{n} (a_k, b_k), \]

then we have

\[ b - a < \sum_{k=1}^{n} (b_k - a_k). \]

Proof. The proof is done by induction on n for the statement of the proposition.
We assume that a ≤ b.
If n = 1, then [a, b] ⊆ (a_1, b_1), so a_1 < a ≤ b < b_1, hence
b − a < b_1 − a_1 = Σ_{k=1}^{1} (b_k − a_k).
For the induction step, suppose that the proposition is true for n = m, and let
(a_k, b_k) be open intervals with a_k < b_k for each k = 1, 2, ..., m + 1 such that

\[ [a,b] \subseteq \bigcup_{k=1}^{m+1} (a_k, b_k). \]

Then b is a member of one of the intervals covering [a, b], which we may assume
to be (a_{m+1}, b_{m+1}) without loss of generality. Then a_{m+1} < b < b_{m+1}. If now
a_{m+1} < a, then b − a < b_{m+1} − a_{m+1} and we are done, so assume a ≤ a_{m+1}. Then
since [a, a_{m+1}] ∩ (a_{m+1}, b_{m+1}) = ∅, we must have

\[ [a, a_{m+1}] \subseteq \bigcup_{k=1}^{m} (a_k, b_k), \]

and hence by induction hypothesis we have:

\[ a_{m+1} - a < \sum_{k=1}^{m} (b_k - a_k), \]

and so

\[ b - a < (a_{m+1} - a) + (b_{m+1} - a_{m+1}) < \sum_{k=1}^{m} (b_k - a_k) + (b_{m+1} - a_{m+1}) = \sum_{k=1}^{m+1} (b_k - a_k). \]

Thus the result holds for n = m + 1. □
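
The induction in this proof can be read as a procedure: repeatedly pick an interval containing the current right endpoint and move the right endpoint to that interval's left end. A minimal Python sketch of this reading follows, using exact rational arithmetic; the function name and the sample cover are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def chain_from_cover(a, b, intervals):
    """Follow the induction of the proof on a finite cover of [a, b].

    `intervals` is a list of pairs (a_k, b_k) with a_k < b_k.  Starting at
    the right endpoint b, pick an interval containing the current point and
    jump to its left endpoint, until the chosen interval also contains a.
    Returns the chosen intervals; raises ValueError if some point reached
    is not covered.
    """
    remaining = list(intervals)
    chosen = []
    point = b
    while True:
        hit = next(((lo, hi) for lo, hi in remaining if lo < point < hi), None)
        if hit is None:
            raise ValueError(f"{point} is not covered")
        remaining.remove(hit)
        chosen.append(hit)
        if hit[0] < a:          # the chosen interval reaches past a: done
            return chosen
        point = hit[0]          # now [a, point] must be covered by the rest


F = Fraction
cover = [(F(-1, 10), F(4, 10)), (F(3, 10), F(7, 10)), (F(1, 2), F(11, 10)),
         (F(2, 10), F(3, 10))]
chain = chain_from_cover(F(0), F(1), cover)
chain_length = sum(hi - lo for lo, hi in chain)
total_length = sum(hi - lo for lo, hi in cover)
```

In this example the chain uses three of the four intervals and has total length 3/2, which exceeds b − a = 1 as the proposition predicts; the unused interval only adds to the total.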

An application of the Heine–Borel theorem now immediately yields Borel’s Theorem
that no interval I can be covered by a sequence of open intervals having total
length less than the length of I.


Theorem 1011 (Borel’s Theorem). Let [a, b] be a closed interval and suppose
that (a_k, b_k) is an open interval with a_k < b_k for each k ∈ N. If

\[ [a,b] \subseteq \bigcup_{k=1}^{\infty} (a_k, b_k), \]

then we have

\[ b - a < \sum_{k=1}^{\infty} (b_k - a_k). \]


Problem 1012. If A and B are disjoint bounded closed sets then there exists ρ > 0
such that no interval of length < ρ intersects both A and B.


[Hint: All intervals I = (a, b) with (2a − b, 2b − a) ∩ B = ∅ form an open cover
of A, and so have a finite subcover I_1, I_2, ..., I_n. Take ρ = min_k len(I_k).]


15.2 Sets of Lebesgue Measure Zero 


Definition 1013 (Lebesgue Measure Zero). E ⊆ R is said to be a set of Lebesgue
measure zero or simply a measure zero set if E can be covered by sequences of
intervals of arbitrarily small total length, i.e., if for any positive number ε > 0 there
is a sequence of open intervals ⟨(a_n, b_n) | n ∈ N⟩ with

\[ E \subseteq \bigcup_{n=1}^{\infty} (a_n, b_n) \quad\text{and}\quad \sum_{n=1}^{\infty} (b_n - a_n) < \varepsilon. \]


Clearly a subset of a set of measure zero is of measure zero. We also have:

Proposition 1014. A countable union of measure zero sets has measure zero.

Proof. Let A_m have measure zero for each m ∈ N, and let ε > 0. Then for each m,
since A_m has measure zero, there is a sequence ⟨(a_{m,n}, b_{m,n}) | n ∈ N⟩ such that

\[ A_m \subseteq \bigcup_{n=1}^{\infty} (a_{m,n}, b_{m,n}) \quad\text{and}\quad \sum_{n=1}^{\infty} (b_{m,n} - a_{m,n}) < \frac{\varepsilon}{2^m}. \]

Then we have

\[ \bigcup_{m=1}^{\infty} A_m \subseteq \bigcup_{m=1}^{\infty} \bigcup_{n=1}^{\infty} (a_{m,n}, b_{m,n}) \]

with

\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} (b_{m,n} - a_{m,n}) < \sum_{m=1}^{\infty} \frac{\varepsilon}{2^m} = \varepsilon. \]

As ε > 0 was arbitrary, it follows that ⋃_m A_m has measure zero. □


Since every singleton set is of measure zero, we get: 
Corollary 1015. Every countable set has measure zero. 
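
Combining the ε/2^m device of Proposition 1014 with the fact that singletons have measure zero gives an explicit cover of any enumerated countable set. The Python sketch below is an illustration under assumptions: it covers only the first few terms of one particular (hypothetical) enumeration of the rationals in [0, 1], and it gives the n-th point an open interval of length ε/2^(n+1).

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1], listed by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


def small_cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of length eps / 2^(n+1)."""
    cover = []
    for n, x in enumerate(points, start=1):
        half = eps / 2 ** (n + 2)       # half of the length eps / 2^(n+1)
        cover.append((x - half, x + half))
    return cover


eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
cover = small_cover(points, eps)
total = sum(hi - lo for lo, hi in cover)   # eps/2 * (1 - 2^-50), below eps
```

However many points are listed, the total length stays below ε/2, which is why the full infinite enumeration can also be covered with total length less than ε.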
But not all measure zero sets are countable: 

Problem 1016. The Cantor set has measure zero. 


[Hint: The closed set K_n found at stage n of the construction of the Cantor set
consists of 2^n disjoint closed intervals each of length 1/3^n, and so K_n can be covered
by 2^n open intervals of total length 2(2^n/3^n).]
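
The counts in the hint can be checked directly. The Python sketch below builds the stage-n sets K_n by repeatedly removing open middle thirds, starting from K_0 = [0, 1] (the usual construction, assumed here), and then enlarges each closed interval to an open interval of twice its length; the function names are illustrative.

```python
from fractions import Fraction


def cantor_stage(n):
    """Closed intervals [lo, hi] making up the stage-n set K_n, with K_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            nxt.append((lo, lo + third))        # keep the left third
            nxt.append((hi - third, hi))        # keep the right third
        intervals = nxt
    return intervals


def open_cover_of_stage(n):
    """Open intervals of twice the length, centered on the intervals of K_n."""
    cover = []
    for lo, hi in cantor_stage(n):
        half = (hi - lo) / 2
        cover.append((lo - half, hi + half))
    return cover


n = 5
stage = cantor_stage(n)
cover = open_cover_of_stage(n)
total = sum(hi - lo for lo, hi in cover)   # 2 * (2/3)^n, which tends to 0
```

Since the Cantor set is contained in every K_n and 2(2/3)^n tends to 0, covers of this kind have arbitrarily small total length.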


On the other hand, Borel’s Theorem (Theorem 1011) immediately gives examples
of sets not having measure zero:


Corollary 1017. If a < b then the interval [a, b] does not have measure zero.
Hence a measure zero set cannot contain a nonempty open interval.


Corollary 1018. If a < b then the interval [a, b] is uncountable.


Thus we have another proof that R is uncountable. 


Small Sets, Ideals, and σ-Ideals


The above results indicate that in a sense the sets of measure zero are “small subsets” 
of R. Some other examples of collections of sets which may be regarded as “small” 
are: 


• The finite subsets of R.

• The countable subsets of R.

• The nowhere dense subsets of R.

• The closed discrete subsets of R.




In general, the notion of “small subsets of R” can be axiomatized as: 


1. ∅ is small but R is not small.
2. A subset of a small set is small. 
3. The union of two (or finitely many) small sets is small. 


All the above examples satisfy these properties. (In addition, they satisfy the 
stronger condition that no small set contains a nonempty open interval.) 

Any collection of sets which satisfies the three conditions listed above is called 
an ideal of sets. 

The collection of countable sets and the collection of measure zero sets satisfy 
an additional fourth property: 


4. The union of countably many small sets is small. 


Ideals which satisfy this additional property are called σ-ideals. The sets of measure
zero form an important σ-ideal.


The Borel Conjecture 


Definition 1019 (Strong Measure Zero). A set A is said to have strong measure
zero if for any sequence ⟨ε_n | n ∈ N⟩ of positive numbers ε_n > 0, there exists a
sequence ⟨(a_n, b_n) | n ∈ N⟩ of open intervals such that

\[ b_n - a_n < \varepsilon_n \ \text{ for all } n \in \mathbf{N} \quad\text{and}\quad A \subseteq \bigcup_{n=1}^{\infty} (a_n, b_n). \]


Problem 1020. Show that the collection of sets of strong measure zero is a σ-ideal
containing all countable sets.


Problem 1021. Show that the Cantor set does not have strong measure zero. 


The assertion that every set of strong measure zero is countable is known as the 
Borel conjecture. It can neither be proved nor be disproved using the usual axioms 
of set theory (provided that these axioms are consistent). 


15.3 Lebesgue Measurable Sets 


Definition 1022 (Lebesgue Measurable Sets). We say that A ⊆ R is Lebesgue
measurable (or simply measurable) if for any ε > 0 there is an open set G and a
closed set F with F ⊆ A ⊆ G, such that G ∖ F can be covered by a sequence
of open intervals of total length less than ε, i.e., there is a sequence ⟨I_n | n ∈ N⟩ of
open intervals such that G ∖ F ⊆ ⋃_n I_n and Σ_n len(I_n) < ε.

We will let L denote the collection of all Lebesgue measurable sets.
The results in the following problem are immediate from the definition.

Problem 1023. Show that


1. The complement of a measurable set is measurable. 
2. Any set of measure zero is measurable. 
3. Every interval is measurable. 


Proposition 1024. Let A be a Lebesgue measurable set. If A is not of measure zero 
then A contains an uncountable closed set and hence a perfect set. 


Proof. Suppose that A does not contain any uncountable closed set. We show that
then A must have measure zero.

Given ε > 0, fix closed F and open G with F ⊆ A ⊆ G and a sequence
⟨I_n⟩ of open intervals covering G ∖ F with Σ_n len(I_n) < ε/2. By assumption, F
must be countable, so there is a sequence ⟨J_n⟩ of open intervals covering F with
Σ_n len(J_n) < ε/2. The combined sequence of intervals I_1, J_1, I_2, J_2, ... covers A
and has total length < ε. Since ε is arbitrary, it follows that A has measure zero. □


Problem 1025. A is measurable if and only if A ∩ (a, b) is measurable for all
a < b.


[Hint: If F_n ⊆ (n, n + 2) is closed for all n ∈ Z, then ⋃_{n∈Z} F_n is closed.]

Proposition 1026. Every open set is measurable.


Proof. By the last problem, it suffices to show that every bounded open set is
measurable. Let G be an open set contained in (a, b), and suppose that ε > 0.
Since G is open, we have G = ⋃_n (a_n, b_n) for some sequence of pairwise disjoint
open intervals (a_n, b_n), n = 1, 2, ....

Note that if J_1, J_2, ..., J_n are finitely many pairwise disjoint open subintervals of
(a, b), then by rearranging them we can assume J_k = (c_k, d_k), k = 1, 2, ..., n with
a ≤ c_1 < d_1 ≤ c_2 < d_2 ≤ ⋯ ≤ c_n < d_n ≤ b, and so the total length of the intervals
Σ_{k=1}^{n} len(J_k) cannot exceed b − a. In particular, we have Σ_{k=1}^{n} (b_k − a_k) ≤ b − a
for any n, and so Σ_{n=1}^{∞} (b_n − a_n) ≤ b − a. Hence there exists m ∈ N such that
Σ_{n=m+1}^{∞} (b_n − a_n) < ε/2.

Now put F = ⋃_{n=1}^{m} [a′_n, b′_n], where a′_n = a_n + ε/(4m) and b′_n = b_n − ε/(4m). Then F is
a closed subset of G and moreover G ∖ F is covered by the open intervals (a_n, a′_n),
n = 1, 2, ..., m, (b′_n, b_n), n = 1, 2, ..., m, and (a_n, b_n), n = m + 1, m + 2, ...,
whose total length is less than ε/4 + ε/4 + ε/2 = ε. □


Theorem 1027. Let ⟨A_n | n ∈ N⟩ be a sequence of measurable sets. Then A :=
⋃_n A_n is measurable.


Proof. Let a < b be arbitrary reals. It suffices to show that the set A* := A ∩ (a, b)
is measurable. Let ε > 0 be given.

Put A*_n := A_n ∩ (a, b). Then A*_n is measurable for each n, and so we can find
closed F_n and open G_n such that F_n ⊆ A*_n ⊆ G_n and G_n ∖ F_n is covered by a
sequence of intervals of total length < ε/2^{n+1}, where we assume that G_n ⊆ (a, b) for
all n (by replacing G_n by G_n ∩ (a, b) if necessary). Thus the open set
V := ⋃_n (G_n ∖ F_n) can be covered by a sequence of open intervals of total length
< Σ_{n=1}^{∞} ε/2^{n+1} = ε/2.

Let G := ⋃_n G_n, so that G is an open set contained in (a, b), and let U_n := G ∖ F_n.
Then U_n is open, hence measurable, and so there is a closed H_n ⊆ U_n such that
U_n ∖ H_n is covered by a sequence of intervals of total length < ε/2^{n+1}, for each n.
Therefore the open set U := ⋃_n (U_n ∖ H_n) can be covered by a sequence of open
intervals of total length < Σ_{n=1}^{∞} ε/2^{n+1} = ε/2.

Note that we have ⋂_n H_n ⊆ ⋂_n U_n ⊆ V, and hence by the Heine–Borel theorem
there is m such that ⋂_{k=1}^{m} H_k ⊆ V.
Now put F = ⋃_{k=1}^{m} F_k. Then F is closed with F ⊆ A* ⊆ G, and

\[ G \setminus F = \bigcap_{k=1}^{m} U_k \subseteq \Bigl(\bigcap_{k=1}^{m} H_k\Bigr) \cup \bigcup_{k=1}^{m} (U_k \setminus H_k) \subseteq V \cup U. \]

Since each of V and U can be covered by a sequence of open intervals of total
length less than ε/2, it follows that G ∖ F can be covered by a sequence of intervals
of total length less than ε. □


By the above results, the collection L of all Lebesgue measurable sets contains 
all measure zero sets, intervals, open and closed sets, and is closed under taking 
countable unions and complements of sets, and hence under forming countable 
intersections as well. This makes L a very comprehensive collection of sets, and 
in fact a sigma-algebra of sets, discussed in Sect. 18.1. 

The basic result in Lebesgue measure theory is Lebesgue’s landmark theorem 
that the length function for intervals can be uniquely extended to a nonnegative 
countably additive function on L: 


Theorem 1028 (Lebesgue). There is m: L → [0, ∞] such that


1. m is countably additive: If A_1, A_2, ... are pairwise disjoint measurable sets, then
m(⋃_n A_n) = Σ_n m(A_n).

2. m(I) = len(I) for any interval I (thus m(∅) = 0).


Proof. The proof is given in Appendix B. □


Such a function m must be uniquely defined on L. This and several other important 
immediate consequences are derived in the following corollary: 


Corollary 1029. Suppose that m is as in Theorem 1028, r ∈ R, and A, B,
A_1, A_2, ..., A_n, ... are measurable sets. The following properties hold:


1. Monotonicity: A ⊆ B ⇒ m(A) ≤ m(B).

2. Countable Subadditivity: m(⋃_n A_n) ≤ Σ_n m(A_n).

3. Uniqueness: If m′: L → [0, ∞] also satisfies Theorem 1028 then m′ = m.

4. Outer Regularity: For any ε > 0 there is open G ⊇ A with m(G ∖ A) < ε.

5. Translation Invariance: m(A + r) = m(A), where A + r := {x + r | x ∈ A}.

6. CCC Property: If ⟨E_i | i ∈ I⟩ is an arbitrary family of pairwise disjoint measurable
sets, then m(E_i) = 0 for all but countably many i ∈ I.


Proof. 1. A ∩ (B∖A) = ∅, so m(B) = m(A) + m(B∖A) ≥ m(A).

2. The sets B_n := A_n ∖ ⋃_{k<n} A_k are pairwise disjoint with B_n ⊆ A_n and
⋃_n B_n = ⋃_n A_n. Now apply countable additivity to the B_n and use (1).

3. Suppose m′ also satisfies the two conditions of Theorem 1028. Let E be
measurable and ε > 0. Fix closed F and open G with F ⊆ E ⊆ G and
a sequence of intervals ⟨I_n⟩ covering G∖F with Σ_n len(I_n) < ε. Since G
is open, it is a disjoint union of intervals and so m(G) = m′(G). Hence
m(E) + m(G∖E) = m(G) = m′(G) = m′(E) + m′(G∖E). Now m(G∖E) ≤
m(⋃_n I_n) ≤ Σ_n len(I_n) < ε, and similarly, m′(G∖E) < ε. Hence m(E) and
m′(E) cannot differ by more than ε.

4. Immediate from the definition of measurability and monotonicity.

5. Follows from (4), as the lengths of intervals, and so the measures of open sets,
are translation invariant.

6. For any positive integer k and any integer n, the set
{i ∈ I | m([n, n+1] ∩ E_i) ≥ 1/k} is finite (it has at most k elements, since the
sets [n, n+1] ∩ E_i are pairwise disjoint subsets of [n, n+1]), and m(E_i) > 0
if and only if m([n, n+1] ∩ E_i) > 0 for some n. □


Definition 1030 (Lebesgue Measure). Lebesgue Measure is the unique function
m: L → [0, ∞] which is countably additive and satisfies m([a, b]) = b − a for all
a < b.


15.4 F_σ and G_δ Sets


Definition 1031. A is called an F_σ set if it can be expressed as a countable union of
closed sets, that is if A = ⋃_n A_n for some sequence ⟨A_n | n ∈ N⟩ of closed sets.

B is called a G_δ set if it can be expressed as a countable intersection of open sets,
that is if B = ⋂_n B_n for some sequence ⟨B_n | n ∈ N⟩ of open sets.


Problem 1032. A set is F_σ if and only if its complement is G_δ, and so a set is G_δ if
and only if its complement is F_σ.


Problem 1033. 1. Every countable set is an F_σ set.

2. Every open interval is an F_σ set.

3. Every open set is an F_σ set, and so every closed set is a G_δ set.

4. Every open set and every closed set is both an F_σ set and a G_δ set.

5. The union of countably many F_σ sets is an F_σ set, and the intersection of
countably many G_δ sets is a G_δ set.

6. The intersection of two F_σ sets is an F_σ set, and the union of two G_δ sets is a
G_δ set.


While all open sets and closed sets are both F_σ and G_δ, the set Q of rational numbers
is an F_σ set which is neither open nor closed. Thus the collection of F_σ sets (as well
as the collection of G_δ sets) is strictly larger than the collection of open sets or the
collection of closed sets. We will see later that the set of rational numbers is not a G_δ.


Problem 1034. Give an example of a set which is both F_σ and G_δ, but neither open
nor closed.

Problem 1035. All F_σ sets and G_δ sets are measurable.

Problem 1036. The following conditions are equivalent for any A ⊆ R.

1. A is Lebesgue measurable.

2. G∖F has measure zero for some F_σ set F and G_δ set G with F ⊆ A ⊆ G.

3. A = F ∪ E for some F_σ set F and measure zero set E.

4. A = G∖E for some G_δ set G and measure zero set E.

Many important sets in the theory of real functions are F_σ or G_δ sets.


Problem 1037. Let f: R → R and C be the set of points at which f is continuous.
Show that C is a G_δ set.

[Hint: Let G_n := ⋃{(a, b) | For all x, y ∈ (a, b), |f(x) − f(y)| < 1/n}. Then
C = ⋂_n G_n.]


15.5 The Baire Category Theorem 


Theorem 1038 (Baire Category, Baire 1899). The intersection of countably many
dense open sets is dense.

Proof. Let G = ⋂_n G_n, where each G_n is a dense open set.

The proof will be similar to the proof of uncountability of R in that we will build
a sequence of nested closed intervals I_1 ⊇ I_2 ⊇ ··· ⊇ I_n ⊇ ···, but here at each stage
n, we will make sure that I_n is contained in G_n.

We will make use of the fact that every nonempty open set must contain some
closed interval I = [a, b] with a < b.

To show that G = ⋂_n G_n is dense, let (a, b) be a nonempty open interval so that
a < b. We will show that G ∩ (a, b) is nonempty.

Since G_1 is a dense open set, G_1 ∩ (a, b) is nonempty and open, and so
G_1 ∩ (a, b) contains a closed interval I_1 = [a_1, b_1] with a_1 < b_1. Since G_2 is a
dense open set, G_2 ∩ (a_1, b_1) is nonempty and open, and so G_2 ∩ (a_1, b_1)
contains a closed interval I_2 = [a_2, b_2] with a_2 < b_2. Continuing in this way, we get
a sequence of closed intervals I_n = [a_n, b_n], n = 1, 2, ..., with a_n < b_n such that

I_1 ⊇ I_2 ⊇ ··· ⊇ I_n ⊇ ···   and   I_n ⊆ G_n for all n.

By the nested intervals property there is p ∈ ⋂_n I_n. But since I_n ⊆ G_n for all n, we
have p ∈ ⋂_n G_n = G. Also p ∈ I_1 ⊆ (a, b). Thus p ∈ G ∩ (a, b) and so G ∩ (a, b) ≠ ∅. □
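The nested-interval construction in the proof can be tried out on a very special case. The following is only a sketch under stated assumptions: each dense open set is taken to be G_n = R∖{q_n} for a single rational q_n, only finitely many stages are run, and the function names `shrink_avoiding` and `point_avoiding` are hypothetical names chosen for this illustration.

```python
from fractions import Fraction

def shrink_avoiding(lo, hi, q):
    """Return a closed interval [lo2, hi2] with lo < lo2 < hi2 < hi
    that does not contain q (so it lies inside G = R \\ {q})."""
    third = (hi - lo) / 3
    left = (lo + third / 2, lo + third)     # closed piece in the left third
    right = (hi - third, hi - third / 2)    # closed piece in the right third
    # The two pieces are disjoint, so q lies in at most one of them.
    return right if left[0] <= q <= left[1] else left

def point_avoiding(a, b, excluded):
    """Run one stage per excluded point, starting from the open interval (a, b).
    Every point of the returned closed interval avoids all excluded points."""
    lo, hi = Fraction(a), Fraction(b)
    for q in excluded:                      # stage n uses G_n = R \ {q_n}
        lo, hi = shrink_avoiding(lo, hi, Fraction(q))
    return lo, hi
```

Exact rational arithmetic is used so that the containment checks are not affected by rounding. In the actual proof the process runs through infinitely many stages and the point p is obtained from the nested intervals property; the sketch only exhibits the finite stages.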


Corollary 1039 (Baire Category). The union of countably many nowhere dense
sets cannot contain a nonempty open interval.

Proof. Let A_1, A_2, ..., A_n, ... be nowhere dense sets. Then for each n there is a
dense open set G_n disjoint from A_n. If (a, b) is any nonempty open interval, then
since ⋂_n G_n is dense (by the theorem), (a, b) contains a point p ∈ ⋂_n G_n. Then
p ∉ ⋃_n A_n, and so ⋃_n A_n does not contain the interval (a, b). Since (a, b) was
an arbitrarily chosen interval, it follows that no nonempty open interval is contained
in ⋃_n A_n. □


Definition 1040. A set is called meager or of first category if it can be expressed 
as the union of countably many nowhere dense sets. A set is called comeager or 
residual if its complement is meager. 


Thus the Baire category theorem says that a meager set cannot contain a nonempty 
open interval, or equivalently that a comeager set must be dense. 


Problem 1041. The collection of meager sets forms a σ-ideal. In particular, we
have

1. Every subset of a meager set is meager.

2. The union of countably many meager sets is meager.

3. Every countable set is meager.

4. No interval is meager. In particular, R is not meager, and so the complement of
a meager set cannot be meager.


Recall that all the conditions of the last problem are satisfied if “meager” is replaced 
by “measure zero.” But the following result shows that these two collections are very 
different. 


Proposition 1042. R (or more generally any interval) can be partitioned into two 
disjoint sets one of which is meager and the other has measure zero. 


Proof. Fix an enumeration of the set Q of rational numbers, say Q = {r_n | n ∈ N}.
For each m, n, let I_{m,n} be any open interval of length 1/2^{m+n} containing r_n (e.g.,
say I_{m,n} = (a_{m,n}, b_{m,n}), where a_{m,n} = r_n − 1/2^{m+n+1} and b_{m,n} = r_n + 1/2^{m+n+1}).
Now put

G_m := ⋃_{n=1}^∞ I_{m,n},   G := ⋂_{m=1}^∞ G_m,   and   F := R∖G.

Then G_m ⊇ Q, so G_m is a dense open set, so by the Baire category theorem G is a
comeager dense G_δ, and hence its complement F is a meager F_σ. Also for each m,
Q ⊆ G ⊆ G_m ⊆ ⋃_n I_{m,n} with

Σ_{n=1}^∞ len(I_{m,n}) = Σ_{n=1}^∞ 1/2^{m+n} = (1/2^m) Σ_{n=1}^∞ 1/2^n = 1/2^m.

Thus for each m, G can be covered by a sequence of open intervals of total length
1/2^m. Hence G has measure zero. Thus R = G ∪ F, with G having measure zero,
and F being meager. □
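The length computation in this proof is easy to check with exact arithmetic. The sketch below is only an illustration: the helper names `interval` and `total_length` are hypothetical, and only finitely many terms of the series are summed, so it confirms the partial sums (1/2^m)(1 − 1/2^N) rather than the limit itself.

```python
from fractions import Fraction

def interval(m, n, r_n):
    """The sample interval I_{m,n} = (r_n - 1/2^(m+n+1), r_n + 1/2^(m+n+1))."""
    half = Fraction(1, 2 ** (m + n + 1))
    return (r_n - half, r_n + half)

def total_length(m, terms):
    """Sum of len(I_{m,n}) for n = 1, ..., terms.
    The lengths do not depend on which rationals r_n are used."""
    return sum(Fraction(1, 2 ** (m + n)) for n in range(1, terms + 1))
```

For instance, `total_length(3, 20)` stays strictly below 1/8, in line with the bound 1/2^m used in the proof.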


Problem 1043. Every F_σ set whose complement is dense must be meager.

A meager set can be dense. For example, the countable dense set Q of rational
numbers is both meager and of measure zero. The following is a consequence of the
Baire category theorem.

Proposition 1044. The set R∖Q of irrational numbers is a G_δ set which is not F_σ.
Hence, the set Q of rational numbers is an F_σ set which is not G_δ.

Proof. If R∖Q were an F_σ set, it would be meager since any F_σ set with dense
complement must be meager. But that would imply R = (R∖Q) ∪ Q is the union
of two meager sets and hence itself meager, a contradiction. □


Problem 1045. Show (in contrast to the example of Problem 979) that there cannot 
be any function f:R — R such that f is continuous at each rational point and 
discontinuous at each irrational point. 


Problem 1046. Show that ([0, 1] ∩ Q) ∪ ([2, 3]∖Q) is neither G_δ nor F_σ.

Problem 1047. Show that the set of irrational numbers contains a translated copy
of the Cantor set.

[Hint: Consider all translates of the Cantor set by rational numbers.]


15.6 The Continuum Hypothesis for G_δ Sets

In this section we will show that G_δ sets satisfy the continuum hypothesis in the sense
that every G_δ set is either countable or has cardinality 𝔠.

Theorem 1048. Every nonempty dense-in-itself G_δ set E contains a generalized
Cantor set and so there is an injective φ: {0, 1}^N → E with ran(φ) a perfect set.
In particular, every nonempty dense-in-itself G_δ set has cardinality 𝔠.


Proof. Let E be a nonempty dense-in-itself set with

E = ⋂_n G_n

where G_n is an open set for each n ∈ N. We will say that the interior of a closed
interval [a, b] meets a set A if (a, b) ∩ A ≠ ∅. Then we have:

Lemma. For every closed interval I whose interior meets E and every n ∈ N, there
exist disjoint closed subintervals J and K of I such that J, K ⊆ G_n, 0 < len(J) <
1/n, 0 < len(K) < 1/n, and each of the interiors of J and K meets E.

Proof. Suppose that the interior of I = [a, b] meets E and n ∈ N. Since every
open set having nonempty intersection with E contains infinitely many
points of E and (a, b) ∩ E is nonempty, we can fix u, v ∈ E with a < u < v < b.
Fix also t, w such that a < u < t < w < v < b. Then u ∈ G_n ∩ (a, t) and since
G_n ∩ (a, t) is open, we can choose p, q such that u ∈ (p, q) ⊆ G_n ∩ (a, t). Similarly,
we can choose r, s such that v ∈ (r, s) ⊆ G_n ∩ (w, b). Finally choose p′, q′, r′, s′
such that p < p′ < u < q′ < q, r < r′ < v < s′ < s, and q′ − p′, s′ − r′ < 1/n.
Now putting J = [p′, q′] and K = [r′, s′] we get the conclusion of the lemma. □


Now fix c ∈ E and a, b with a < c < b, and put I = [a, b]. Then I satisfies the
condition of the lemma, so there exist disjoint closed subintervals I_0 and I_1 of I
such that I_0, I_1 ⊆ G_1, 0 < len(I_0) < 1, 0 < len(I_1) < 1, and each of the interiors of
I_0 and I_1 meets E. Continuing this process and using the lemma repeatedly, we can
build a family of closed intervals ⟨I_u | u ∈ {0, 1}*⟩ such that for every binary string
u of length n we have:

1. I_u ⊆ G_n and 0 < len(I_u) < 1/n.

2. The interior of I_u meets E.

3. I_{u⌢0}, I_{u⌢1} ⊆ I_u.

4. I_{u⌢0} ∩ I_{u⌢1} = ∅.

The family ⟨I_u | u ∈ {0, 1}*⟩ is thus a Cantor system, and hence determines an
injective φ: {0, 1}^N → R such that for each infinite binary sequence z = z_1 z_2 ··· z_n ···
we have

⋂_n I_{z_1 z_2 ··· z_n} = {φ(z)}.

Then ran(φ) is a generalized Cantor set (hence perfect). But since I_u ⊆ G_n for
any binary string u of length n, it follows that for any infinite binary sequence
z = z_1 z_2 ··· z_n ···,

{φ(z)} = ⋂_n I_{z_1 z_2 ··· z_n} ⊆ ⋂_n G_n = E,

and so φ(z) ∈ E. Hence ran(φ) ⊆ E, so E contains the generalized Cantor set
ran(φ). □


Since every closed set is G_δ, every perfect set is a dense-in-itself G_δ, and we have
another proof of Theorem 598 for the case of R:

Corollary 1049 (Cantor). A nonempty perfect set in R has cardinality 𝔠.

Since the set of rational numbers is a countable dense-in-itself set, Theorem 1048
gives another proof of the following:

Corollary 1050. The set Q of rational numbers is not a G_δ set, and hence the set
of irrational numbers is not an F_σ set.

We now have the result that the G_δ sets, and therefore the closed sets, satisfy the
continuum hypothesis.

Corollary 1051. Every uncountable G_δ set contains a generalized Cantor set and
hence has cardinality 𝔠.

Proof. Let E be an uncountable G_δ set and let C be its set of condensation points.
By Corollary 971, C is a nonempty dense-in-itself set. By Theorem 970, E∖C is
countable and so is an F_σ set, and hence C = E∖(E∖C) is G_δ. By Theorem 1048,
C contains a generalized Cantor set (which is perfect and of cardinality 𝔠). □


Corollary 1052 (Cantor). Any uncountable closed subset of R contains a generalized
Cantor set and hence has cardinality 𝔠.


We will later show that the continuum hypothesis is satisfied by a larger class of sets 
called analytic sets. 


Note that a set contains a generalized Cantor set if and only if it contains a 
nonempty perfect set. Hence we make the following definition. 


Definition 1053 (The Perfect Set Property). A set is said to have the perfect set 
property if it is either countable or contains a perfect set (or equivalently, contains a 
generalized Cantor set). 

A collection of sets is said to have the perfect set property if every set in the 
family has the perfect set property. 


Thus Corollaries 1052 and 1051 are simply stating that the closed sets and the G_δ
sets, respectively, have the perfect set property.


15.7 The Banach-Mazur Game and Baire Property 


For each A ⊆ R, a two person infinite game of perfect information called the
Banach–Mazur game G**_A is defined as follows:

Two players I and II alternately play an infinite sequence of bounded closed
intervals of positive length with player I going first:

The game G**_A:

    I      I_1         I_3         ···
    II          I_2         I_4         ···

Rules:

1. Each I_n is a closed interval of finite positive length.

2. Each player's move must be contained in the opponent player's previous move,
so that we have a nested sequence of intervals:

I_1 ⊇ I_2 ⊇ I_3 ⊇ I_4 ⊇ ··· ⊇ I_{n−1} ⊇ I_n ⊇ ···

3. 0 < len(I_1) < ∞ and 0 < len(I_{n+1}) < ½ len(I_n).

Any play of the game as above therefore defines a unique real number p in the
intersection of the sequence ⟨I_n⟩ of nested intervals: p ∈ ⋂_n I_n.

Winning conditions: Player I wins the above play of the game G**_A if p ∈ A;
otherwise, Player II wins.


Problem 1054 (Mazur). Let A ⊆ R. Show that in the game G**_A above:

1. If A is meager then Player II has a “winning strategy,” i.e., Player II can force a
win no matter how Player I plays.

2. If there is a nonempty open set U such that U∖A is meager then Player I has a
winning strategy.

[Hint: Mimic the proof of the Baire category theorem.]
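To get a feeling for part 1, one can simulate finitely many rounds in the simplest case where A is countable (hence meager). The sketch below is only an illustration under assumptions: A is given by a finite initial part of an enumeration, Player I's strategy `middle_third` is just one arbitrary sample strategy, and all function names are hypothetical. At round k, Player II replies with a subinterval that misses the k-th listed point of A.

```python
from fractions import Fraction

def player_two_reply(move, target):
    """Reply to `move` with a closed subinterval of a quarter of its length
    that does not contain `target`."""
    lo, hi = move
    quarter = (hi - lo) / 4
    left, right = (lo, lo + quarter), (hi - quarter, hi)
    # left and right are disjoint, so target lies in at most one of them.
    return right if left[0] <= target <= left[1] else left

def middle_third(move):
    """A sample strategy for Player I: play the middle third."""
    lo, hi = move
    third = (hi - lo) / 3
    return (lo + third, hi - third)

def play(first_move, player_one, enumeration):
    """Play one round per listed point of A and return all moves in order."""
    moves = [first_move]
    for target in enumeration:
        moves.append(player_two_reply(moves[-1], target))  # Player II moves
        moves.append(player_one(moves[-1]))                # Player I moves
    return moves
```

Both kinds of moves shrink the length by a factor less than 1/2, as rule 3 requires. Since the intervals are nested, a point excluded at round k stays excluded forever, so in an infinite play along a full enumeration of A the limit point p would lie outside A. For a general meager set the same idea is applied to nowhere dense pieces instead of single points, which the sketch does not attempt.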


Note. Mazur invented the game G**_A and proved the above results. He then asked
if the converses are true. Banach showed that the answer is yes and won a bottle of
wine as a prize from Mazur.


Definition 1055 (Baire Property). A set A ⊆ R is said to have the Baire property
if there is an open set U such that A△U = (A∖U) ∪ (U∖A) is meager. The class
of all sets with Baire property will be denoted by Y.

Corollary 1056. If A ⊆ R has Baire property, then either A is meager or U∖A is
meager for some nonempty open set U.

Problem 1057. If A ⊆ R has Baire property, then the game G**_A is “determined,”
i.e., at least one of the players has a winning strategy.


Problem 1058. Every open set and every meager set has Baire property. 


Proposition 1059. Let A be a set with Baire property. If A is non-meager, then A
contains a perfect set.

Proof. By Corollary 1056, fix a nonempty open U such that U∖A is meager, and
let F be a meager F_σ set with (U∖A) ⊆ F. Then U∖F is a G_δ set which must
be uncountable (if U∖F were countable then U ⊆ (U∖F) ∪ F would be meager,
but no nonempty open set is meager). Hence by the perfect set property for G_δ sets,
U∖F contains a perfect set. But then, A contains a perfect set, since (U∖F) ⊆ A. □


Proposition 1060. If A has Baire property then so does R∖A. If a sequence of sets
A_1, A_2, ... all have Baire property, then so does their union ⋃_n A_n.

Proof. Let A have Baire property. Fix open U with A△U meager. Put V = R∖Ū,
where Ū is the closure of U, so that V is open and the boundary set Ū∖U is nowhere
dense (and hence meager). Now notice that (R∖A)△V is contained in (A△U) ∪ (Ū∖U),
which is meager, hence R∖A has Baire property.

Next assume that A_n has Baire property for each n ∈ N, and fix open U_n such
that A_n△U_n is meager. Then the countable union ⋃_n (A_n△U_n) is also meager. Now
let U := ⋃_n U_n. Then U is open and (⋃_n A_n)△U ⊆ ⋃_n (A_n△U_n), and so ⋃_n A_n
has Baire property. □

By the above results, the collection Y of all sets having Baire property contains all
meager sets, intervals, open and closed sets, and is closed under taking countable
unions and complements of sets, and hence under forming countable intersections
as well. Hence Y also contains all F_σ and all G_δ sets. Thus, like the collection
L of all Lebesgue measurable sets, the collection Y of sets with Baire property is
also a large collection of sets forming a sigma-algebra (Sect. 18.1).


Problem 1061. Suppose that A and B are sets with Baire property, and U and
V are open sets with A△U and B△V both meager. If A ∩ B is meager, then
U ∩ V = ∅.

[Hint: By the Baire category theorem, no nonempty open set is meager. Now the
open set U ∩ V is contained in the meager set (A ∩ B) ∪ (U∖A) ∪ (V∖B).]


Corollary 1062 (CCC Property). If ⟨A_i | i ∈ I⟩ are pairwise disjoint sets with
Baire property, then A_i is meager for all but countably many i ∈ I.

Problem 1063 (Translation Invariance). If E has Baire property then so does
E + p, where E + p := {x + p | x ∈ E}.


An infinite binary sequence ⟨x_n⟩ ∈ {0, 1}^N is called 1-normal if the relative
frequency of 1s among the first n bits approaches the limiting value 1/2, i.e., if

lim_{n→∞} (1/n) Σ_{k=1}^n x_k = 1/2.

It is a celebrated result of Borel that the set N_1 of all
x ∈ [0, 1] which admit a 1-normal binary representation has measure 1.

Problem 1064. Show that N_1 is meager.

[Hint: A dyadic interval (k/2^n, (k+1)/2^n) fixes the first n bits.]
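The relative frequency in the definition of 1-normality can be computed exactly for rational numbers. The following sketch is only an illustration: the function names are hypothetical, only a finite prefix of the expansion is examined (which by itself proves nothing about the limit), and for dyadic rationals it produces the terminating one of the two binary representations.

```python
from fractions import Fraction

def binary_digits(x, count):
    """First `count` binary digits of x in [0, 1), found by repeated doubling."""
    x = Fraction(x)
    digits = []
    for _ in range(count):
        x *= 2
        bit = int(x >= 1)      # the next binary digit
        digits.append(bit)
        x -= bit               # keep the fractional part
    return digits

def frequency_of_ones(x, count):
    """Relative frequency of 1s among the first `count` binary digits of x."""
    return Fraction(sum(binary_digits(x, count)), count)
```

For example, 1/3 = 0.010101... has frequency exactly 1/2 over any even number of digits, and since its expansion is periodic the limit is 1/2, so this expansion is 1-normal. By contrast 1/7 = 0.001001... has limiting frequency 1/3 and is not 1-normal.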


15.8 Vitali and Bernstein Sets 


Vitali Sets 


Definition 1065 (Vitali). A set V ⊆ R is said to be a Vitali set if the following
conditions hold:

1. (V + r) ∩ (V + s) = ∅ for all distinct rational numbers r ≠ s (in Q).

2. ⋃_{r∈Q} (V + r) = R.

Theorem 1066 (AC). Vitali sets exist.

Proof. Recall the equivalence relation on R defined by x ∼_Q y ⇔ x − y ∈ Q. The
corresponding partition R/Q of R consists of sets of the form a + Q := {a + r |
r ∈ Q}, any two of which are either identical or disjoint. Now note that V is a Vitali
set if and only if it is a choice set for the partition R/Q. Hence we obtain a Vitali
set using AC. □

Theorem 1067. A Vitali set cannot be Lebesgue measurable and cannot have Baire
property. Hence there are subsets of R which are neither Lebesgue measurable nor
have the Baire property.

Proof. Let V be a Vitali set.

Suppose, if possible, that V is measurable. By translation invariance we have
m(V + r) = m(V) for all r, and R is a countable union of sets of the form V + r,
hence V cannot have measure zero.

Hence there exist a < b such that m([a, b] ∩ V) > 0. Put W := [a, b] ∩ V.
Then ⟨W + r | r ∈ Q ∩ (0, 1)⟩ is a family of pairwise disjoint measurable sets all
having the same positive measure m(W), hence by countable additivity of m the
union ⋃{W + r | r ∈ Q ∩ (0, 1)} has infinite measure. On the other hand this union
is contained in (a, b + 1) and so has finite measure ≤ b + 1 − a. We thus get a
contradiction.

We next show that V cannot have Baire property. Suppose, if possible, there is an
open set U such that V△U is meager. Then U is nonempty since if U were empty,
then V and so V + r would be meager for all r, and R would be a countable union
of meager sets, which is impossible.

Hence U is nonempty open, and we can fix a < b with (a, b) ⊆ U. Put W :=
(a, b) ∩ V. Then (a, b)∖W is meager. Now fix any rational 0 < r < b − a. Then
W ∩ (W + r) = ∅, so by Problem 1061, (a, b) ∩ (a + r, b + r) = ∅, which is a
contradiction since ½(a + r + b) ∈ (a, b) ∩ (a + r, b + r). □


Feferman showed that the existence of a Vitali set cannot be proved without using 
the Axiom of Choice, and even if the full use of AC is allowed, no effectively defined 
set can be proved (without additional axioms) to be a Vitali set. 

Note that the main property of Lebesgue measurable sets and sets having Baire 
property that was used in the above proof is translation invariance. We next show a 
very different method for obtaining non-measurable sets. 


Bernstein Sets 


Definition 1068 (Bernstein). A set B is said to be a Bernstein set if neither B nor
its complement R∖B contains any nonempty perfect set.


Theorem 1069 (AC). Bernstein sets exist. 


Proof. By the Axiom of Choice every infinite cardinal is an aleph, and so we can
fix α such that 𝔠 = ℵ_α. Since there are exactly 𝔠 nonempty perfect sets, we can
enumerate them as ⟨P_ν | ν < ω_α⟩. Fix a choice function φ for the collection of all
nonempty subsets of R. Now define, by transfinite recursion, two sequences of reals
⟨a_ν | ν < ω_α⟩ and ⟨b_ν | ν < ω_α⟩ as follows.

a_ν := φ(P_ν ∖ {a_ξ, b_ξ | ξ < ν}),   b_ν := φ(P_ν ∖ ({a_ν} ∪ {a_ξ, b_ξ | ξ < ν})).

The elements a_ν and b_ν are well defined since for each ν, the perfect set P_ν has
cardinality 𝔠 while the sets being removed from P_ν have cardinality < 𝔠. Also we
have a_μ ≠ b_ν for all μ, ν < ω_α.

Finally, put A = {a_ν | ν < ω_α} and B = {b_ν | ν < ω_α}. Then A ∩ B = ∅, with
a_ν ∈ P_ν ∩ A and b_ν ∈ P_ν ∩ B for all ν < ω_α. Thus every nonempty perfect set has
nonempty intersection with each of the disjoint sets A and B, and so both A and B
must be Bernstein sets. □


Let B be a Bernstein set. By Propositions 1024 and 1059, both A ∩ B and A∖B
are non-measurable for any measurable set A of nonzero measure, and both A ∩ B
and A∖B fail to have Baire property for any non-meager set A with Baire property.
This gives a stronger result than Theorem 1067:


Corollary 1070 (AC). Every set not of measure-zero has non-measurable subsets. 
Every non-meager set has subsets without the Baire property. 

Consequently, there are measurable sets without Baire property and there are 
sets with Baire property which are non-measurable. 


Notice the highly non-effective way of obtaining Vitali and Bernstein sets. In both 
cases, an uncountable number of choices were essential. 

It turns out that this is unavoidable: No non-measurable set can be proved to
exist using only countably many choices. By a famous result of Solovay, one can
consistently assume that all sets are Lebesgue measurable and that the Axiom of
Dependent Choices holds (assuming that the usual axioms are consistent with the
existence of an inaccessible cardinal). So any proof that non-measurable sets exist
will need uncountably many choices.

Vitali and Bernstein sets are relevant to the question of whether Lebesgue measure
can be extended to a measure which is defined for all sets of reals:


The Measure Extension Problem (Lebesgue). Does there exist a countably additive
function μ: P(R) → [0, ∞] which extends Lebesgue measure?

This question was fully analyzed by Ulam, which opened up the field of large
cardinal numbers. But this is a topic for Postscript III (Chapter 19), where we will
present a detailed account of Ulam's work.


Chapter 16 
Cantor—Bendixson Analysis of Countable 
Closed Sets 


Abstract We devote this chapter to the Cantor–Bendixson analysis of countable
closed sets. We first prove the effective Cantor–Bendixson theorem which decomposes
a closed set into an effectively countable set and a perfect set. We then obtain
a full topological classification for the class of countable closed bounded subsets of
R: The Cantor–Bendixson rank is shown to be a complete invariant for the relation
of homeomorphism between these sets, and the countable ordinals ω^ν n + 1 (ν < ω_1,
n ∈ N) are shown to form an exhaustive enumeration, up to homeomorphism,
of the countable closed bounded sets into ℵ_1 many pairwise non-homeomorphic
representative sets.


Countable closed sets arose in Cantor’s study of trigonometric series. The analysis 
requires iterations of derived sets (i.e., orders of limit points) into the transfinite, and 
it naturally leads to the notion of transfinite ordinals. Roughly speaking, this is how 
Cantor was led to his creation of ordinal numbers, and went on to eventually create 
set theory. We will briefly discuss the background in Sect. 16.4. 


16.1 Homeomorphisms of Orders and Sets 


Homeomorphic Order Types 


A bijection f from an order X onto an order Y is a homeomorphism if both
f and its inverse f^{-1} are continuous. Two orders are homeomorphic if there is
a homeomorphism between them. Clearly, isomorphic orders are homeomorphic.
If X and X′ are isomorphic orders and Y and Y′ are isomorphic orders, then X is
homeomorphic to Y if and only if X′ is homeomorphic to Y′. Hence we can speak
of homeomorphic order types.

Problem 1071. If X and Y are orders, then f: X → Y is a homeomorphism if and
only if f is a bijection and for every A ⊆ X and p ∈ X, p ∈ D(A) in X if and
only if f(p) ∈ D(f[A]) in Y.

Problem 1072. Show that

1. ω is homeomorphic to ζ.

2. ω + n is homeomorphic to ω + 1 for any n ∈ N.

3. ω + 1 + ω* is homeomorphic to ω + 1.

4. ζ + 1 is not homeomorphic to ω + 1.

5. 1 + η is homeomorphic to η.

6. 1 + λ is homeomorphic to λ + 1.

7. η + 1 is homeomorphic to η, but λ + λ is not homeomorphic to λ.


Problem 1073. If α, β, etc. are order types, then:

1. α is homeomorphic to α*.

2. If α and β both are order types of orders with endpoints, then α + β is
homeomorphic to β + α.

3. If I is an ordered set and if for each i ∈ I, α_i and β_i are order types
with endpoints with α_i homeomorphic to β_i, then Σ_{i∈I} α_i is homeomorphic to
Σ_{i∈I} β_i.


Homeomorphisms Between Subsets of R and Orders 


A subset Y ⊆ R is said to be homeomorphic to an order X if there is a bijection f
from X onto Y such that for every A ⊆ X and p ∈ X, p ∈ D(A) in X if and only
if f(p) ∈ D(f[A]) in R. If Y ⊆ R and X and X′ are isomorphic orders, then Y
is homeomorphic to X if and only if Y is homeomorphic to X′. Hence we can talk
about a subset Y of R being homeomorphic to an order type α.


Problem 1074. Let A = {0, 1/2, 2/3, ..., (n−1)/n, ...} ∪ {2}, so that the order type of
A is ω + 1. Show that A is not homeomorphic to ω + 1, and in fact that A is
homeomorphic to ω.


The above problem shows that if A ⊆ R has order type α, then A may fail to be
homeomorphic to α. The reason for the failure in the above is easily found: While 2
is a limit point of A when the suborder A is considered as an order by itself, 2 is not
a limit point of A as a subset of R.


Problem 1075. Show that there is a subset of R having order type η which is
homeomorphic to ω. Conclude that for any infinite countable order type α there
is a subset A of R having order type α which is homeomorphic to ω.

Recall that A ⊆ R is said to be continuously order-embedded if whenever p ∈ A is
a limit point of E ⊆ A when A is considered as an order by itself, then p is a limit
point of E in R. Then we have

Proposition 1076. If A ⊆ R is continuously order-embedded in R and the order
type of A is α, then A is homeomorphic to α.


Recall also that there are two important cases when A ⊆ R can be guaranteed to be
continuously order-embedded in R:


1. If A is closed, then A is continuously order-embedded in R. (Theorem 593) 
2. If every point of A is a two-sided limit point of A, then A is continuously order- 
embedded in R. (Theorem 532) 


Problem 1077. Give an example of a subset A of R which is dense-in-itself (every 
point of A is a limit point of A) but which is not continuously order-embedded in R. 


Problem 1078. Let X be an order and let f: X → R be a strictly increasing
function which continuously order-embeds X in R. Then the image f[X] is closed
and bounded in R if and only if X is a complete order with endpoints.

[Hint: Use Theorem 593.]


16.2 The Cantor–Bendixson Theorem and Perfect Sets

Theorem 1079 (Cantor–Bendixson). Let A be any nonempty closed subset of R
and for each α < ω_1 define

F_α := D^(α)(A) = the α-th iterated derived set of A.

Then

1. ⟨F_α | α < ω_1⟩ are decreasing closed sets so that

F_0 ⊇ F_1 ⊇ ··· ⊇ F_n ⊇ F_{n+1} ⊇ ··· ⊇ F_ω ⊇ F_{ω+1} ⊇ ··· ⊇ F_α ⊇ F_{α+1} ⊇ ···

2. The set

H := ⋃_{α<ω_1} (F_α ∖ F_{α+1}) = F_0 ∖ ⋂_{α<ω_1} F_α

is countable, in an effective fashion. In particular, for each α < ω_1 the set
F_α ∖ F_{α+1} is countable.

3. There exists a least μ < ω_1 such that F_{μ+1} = F_μ, and so F_α = F_μ for all
α ≥ μ.

4. For the ordinal μ above, F_μ is either empty or nonempty perfect (hence
uncountable), and so if A is countable then F_μ = ∅.

5. If A is countable and bounded then the ordinal μ above is a successor ordinal.

Proof. 1. This is obtained by transfinite induction as follows. F_0 = A is given to be
closed. For any set E, D(E) is always closed and if E is closed then E ⊇ D(E).
Thus if F_α is closed, then F_{α+1} = D(F_α) is a closed subset of F_α. Finally, since
the intersection of any family of closed sets is closed, if α is a limit
ordinal then F_α = ⋂_{β<α} F_β is a closed subset of F_β for each β < α.

2. Let B be the family of all open intervals with rational endpoints. Since B is
countable, we can enumerate it as

B = {V_n | n ∈ N}.

For each x ∈ H, fix the unique α = α_x such that x ∈ F_{α_x} ∖ F_{α_x+1}, and then
(noting that x ∈ F_{α_x} ∖ D(F_{α_x}) is an isolated point of F_{α_x}) choose the least n =
n_x ∈ N such that V_n ∩ F_{α_x} = {x}. This defines a mapping x ↦ n_x from H into
N. We claim that this mapping is injective. To see this, suppose that x ≠ y are in
H. Now if α_x < α_y then V_{n_x} ∩ F_{α_x+1} = ∅ but V_{n_y} ∩ F_{α_x+1} ⊇ V_{n_y} ∩ F_{α_y} ≠ ∅,
and so n_x ≠ n_y. If α_x > α_y then similarly n_x ≠ n_y. Finally, if α_x = α_y = α
(say), then V_{n_x} ∩ F_α = {x} ≠ {y} = V_{n_y} ∩ F_α, so again we have n_x ≠ n_y. Thus
the mapping x ↦ n_x from H into N is injective and so H is countable.


3. If we had F_α ∖ F_{α+1} ≠ ∅ for all α < ω_1, the set H would be uncountable
(because it would then be the union of ω_1-many pairwise disjoint nonempty sets),
a contradiction. Hence F_α ∖ F_{α+1} = ∅, or F_α = F_{α+1}, for some α, and we can fix
μ to be the least such α.

4. This is immediate since F_{μ+1} = D(F_μ) and a nonempty set E is perfect if and
only if D(E) = E.

5. If A = F_0 is nonempty, countable, and bounded, note first that μ is the least
ordinal such that F_μ = ∅. Since F_0 is nonempty, μ > 0. Finally μ cannot
be a limit ordinal since if F_α is nonempty, closed, and bounded for each α < μ,
then by the Heine–Borel Theorem ⋂_{α<μ} F_α would be nonempty as well. Hence
μ is a successor ordinal. □


A consequence of the proof of the theorem is that a countable closed set is effectively 
countable, that is, each countable closed set A determines a unique effectively 
defined injection from A into N. More generally, we have: 


Corollary 1080 (The Cantor–Bendixson Theorem). Every closed set is the union
of an effectively countable set and a set E with D(E) = E.

Corollary 1081. Every uncountable closed set contains a nonempty perfect set.
Hence closed sets “satisfy the continuum hypothesis,” that is, if A is closed then
either |A| ≤ ℵ_0 or |A| = 2^{ℵ_0}.

Corollary 1082. If A is a nonempty countable closed bounded set, then there is a
unique ν such that D^(ν)(A) ≠ ∅ but D^(ν+1)(A) = ∅. In this case, D^(ν)(A) is a
nonempty finite set.

Proof. The first statement follows from the last two parts of the theorem. For
the second statement, note that if D^(ν)(A) were infinite, then being bounded it
would have a limit point by the Bolzano–Weierstrass theorem, which would imply
D^(ν+1)(A) ≠ ∅, a contradiction. □


Recall that a point p is said to be a condensation point of a set A if every open
interval containing p contains uncountably many points of A.

Problem 1083. Let A be a nonempty closed set, and put F_α = D^(α)(A) and H =
⋃_{α<ω_1} (F_α ∖ F_{α+1}) as in Theorem 1079. Also put P = ⋂_{α<ω_1} F_α. Show that

1. H consists precisely of the non-condensation points of A and P consists
precisely of the condensation points of A.

2. Assuming P ≠ ∅, show that P is the largest perfect set contained in A.


From the above problem it follows that in a closed set A, all except countably many 
points of A are condensation points of A. 


16.3 Ordinal Analysis of Countable Closed Bounded Sets


Using Corollary 1082 we can make the following definition. 


Definition 1084 (CB-rank, Cantor–Bendixson rank). If A is a nonempty countable
closed bounded set, we define its CB-rank (Cantor–Bendixson rank) to be the
pair $\nu, n$ where $\nu$ is the unique ordinal such that $D^{(\nu)}(A) \ne \emptyset$ but
$D^{(\nu+1)}(A) = \emptyset$, and $n = |D^{(\nu)}(A)|$.


Thus the CB-rank of a nonempty countable closed bounded set A equals $\nu, n$ if and
only if $D^{(\nu)}(A)$ is a nonempty finite set with n elements.
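As a quick illustration of the definition (a worked example that is not part of the original text, and which assumes the usual convention $D^{(1)}(A) = D(A)$), consider a convergent sequence together with its limit, and then two disjoint copies of it:

```latex
\[
A = \{0\} \cup \{1/n \mid n \in \mathbb{N}\}: \quad
D^{(1)}(A) = \{0\}, \quad D^{(2)}(A) = \emptyset,
\quad \text{so the CB-rank of } A \text{ is } 1, 1.
\]
\[
B = A \cup \{2 + x \mid x \in A\}: \quad
D^{(1)}(B) = \{0, 2\}, \quad D^{(2)}(B) = \emptyset,
\quad \text{so the CB-rank of } B \text{ is } 1, 2.
\]
```

In both cases the last nonempty derived set is finite, as Corollary 1082 requires.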


Problem 1085. Let $A \subseteq \mathbb{R}$ be a closed bounded set with exactly one limit point.
Show that A is countable and must be homeomorphic to $\omega + 1$. What are the possible
order types for A?


Proposition 1086. Let A and B be homeomorphic countable closed bounded 
subsets of R. Then A and B have identical CB-ranks. 


Proof. Let $f\colon A \to B$ be a homeomorphism of A onto B. Since A and B are closed,
we have $D(A) \subseteq A$, $D(B) \subseteq B$. Since f is a homeomorphism, we have
$f[D(A)] = D(B)$. More generally, by a routine transfinite induction we have
$f[D^{(\alpha)}(A)] = D^{(\alpha)}(B)$ for all ordinals $\alpha$, and the result follows. □


Proposition 1087. If $\alpha$ is an infinite successor ordinal, then $\alpha$ is homeomorphic to
$\omega^{\xi} n + 1$ for some ordinal $\xi > 0$ and some $n \in \mathbb{N}$.


Proof. By Cantor Normal Form, we have

$$\alpha = \omega^{\xi_1} n_1 + \omega^{\xi_2} n_2 + \cdots + \omega^{\xi_k} n_k, \qquad k \in \mathbb{N},$$

where $\xi_1 > \xi_2 > \cdots > \xi_k$ and $n_1, n_2, \dots, n_k \in \mathbb{N}$.

Since $\alpha$ is an infinite successor ordinal, we must have $\xi_k = 0$ (so that $\omega^{\xi_k} = 1$) and
$k \ge 2$, so we can write

$$\alpha = \omega^{\xi_1} n_1 + \omega^{\xi_2} n_2 + \omega^{\xi_3} n_3 + \cdots + \omega^{\xi_{k-1}} n_{k-1} + n_k$$
$$= \omega^{\xi_1} n_1 + (1 + \beta)$$
$$= (\omega^{\xi_1} n_1 + 1) + \beta,$$

for some ordinal $\beta < \omega^{\xi_1}$ which must be either zero or a successor ordinal (since $\alpha$
is a successor ordinal).

If $\beta = 0$, then $\alpha = \omega^{\xi_1} n_1 + 1$ and we are done. Otherwise, $\beta$ is a successor
ordinal, and since $\mu + \nu$ is homeomorphic to $\nu + \mu$ for successor ordinals $\mu$ and $\nu$,
it follows that $\alpha$ is homeomorphic to

$$\beta + (\omega^{\xi_1} n_1 + 1) = \omega^{\xi_1} n_1 + 1,$$

where the last equality holds since $\beta < \omega^{\xi_1}$, $n_1 > 0$, and $\omega^{\xi_1}$ is a remainder ordinal
and so “absorbs any smaller ordinal as a summand from the left.” □
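The steps of this proof can be traced on a concrete ordinal. The following worked example is an illustration added here, not taken from the original text; the particular ordinal is chosen arbitrarily.

```latex
\begin{align*}
\alpha &= \omega^{2}\cdot 3 + \omega\cdot 5 + 7 \\
       &= \omega^{2}\cdot 3 + (1 + \beta), \qquad \beta = \omega\cdot 5 + 7
          \quad (\text{since } 1 + \omega\cdot 5 = \omega\cdot 5) \\
       &= (\omega^{2}\cdot 3 + 1) + \beta \\
       &\cong \beta + (\omega^{2}\cdot 3 + 1) \qquad (\text{homeomorphic, both summands are successor ordinals}) \\
       &= \omega^{2}\cdot 3 + 1 \qquad (\text{since } \beta < \omega^{2}).
\end{align*}
```

So in this instance $\xi = 2$ and $n = 3$: only the leading term of the Cantor Normal Form survives up to homeomorphism.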


Corollary 1088. Let X be a well-order with a last element. Then X is homeomorphic
to the order $W(\omega^{\xi} n + 1) = \{\alpha \mid \alpha \le \omega^{\xi} n\}$ for some ordinal $\xi$ and some
$n \in \mathbb{N}$.


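Definition 1089. Let $\langle I_n \mid n \in \mathbb{N}\rangle$ with $I_n = [a_n, b_n]$ be a sequence of closed
intervals, with $a_n < b_n$ for each $n \in \mathbb{N}$. We say that the sequence of intervals
$\langle I_n \mid n \in \mathbb{N}\rangle$ converges to a point $p \in \mathbb{R}$ and write $\langle I_n \mid n \in \mathbb{N}\rangle \to p$ if for any a, b
with $a < p < b$ there is $m \in \mathbb{N}$ such that $I_n \subseteq (a, b)$ for all $n \ge m$.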


Proposition 1090. Let $I_n = [a_n, b_n]$ and $J_n = [c_n, d_n]$, $n \in \mathbb{N}$, be closed intervals
such that $a_n < b_n$ and $c_n < d_n$ for each $n \in \mathbb{N}$. Assume that
$\langle I_n \mid n \in \mathbb{N}\rangle$ and $\langle J_n \mid n \in \mathbb{N}\rangle$ are each a pairwise disjoint sequence of closed
intervals, and let $A_n \subseteq (a_n, b_n)$ and $B_n \subseteq (c_n, d_n)$ for each $n \in \mathbb{N}$, with $A_n$
homeomorphic to $B_n$ for each $n \in \mathbb{N}$. Suppose that $p, q \in \mathbb{R}$ with $p \notin I_n$ and
$q \notin J_n$ for any n, and that $\langle I_n \mid n \in \mathbb{N}\rangle \to p$ and $\langle J_n \mid n \in \mathbb{N}\rangle \to q$. Then the sets

$$A := \bigcup_{n \in \mathbb{N}} A_n \cup \{p\} \quad\text{and}\quad B := \bigcup_{n \in \mathbb{N}} B_n \cup \{q\}$$

are homeomorphic.


Proof. For each n, fix a homeomorphism $f_n\colon A_n \to B_n$ (using AC), and define
$f\colon A \to B$ by setting $f(p) = q$ and $f(x) = f_n(x)$ if $x \in A_n$. Clearly f is a
bijection from A onto B.

To show that f is continuous, suppose that $\langle x_n\rangle \to x$ in A.


Case 1: $x \ne p$. Then $x \in A_m$ for some m. Since $A_m \subseteq (a_m, b_m)$, we have $a_m < x < b_m$
and hence there is k such that $x_n \in (a_m, b_m)$ for all $n \ge k$. Then $x_n \in A_m$
for all $n \ge k$, and thus $\langle x_n \mid n \ge k\rangle$ is a sequence in $A_m$ converging to $x \in A_m$.
Since $f_m\colon A_m \to B_m$ is continuous, the sequence $\langle f_m(x_n) \mid n \ge k\rangle$ converges to
$f_m(x)$ in $B_m$.

Case 2: $x = p$. In this case we show that $\langle f(x_n) \mid n \in \mathbb{N}\rangle$ converges to $f(p) = q$.
Suppose that $c < q < d$. Since the sequence of intervals $\langle J_n\rangle$ converges to
q, there is m such that $J_n \subseteq (c, d)$ for all $n \ge m$. Since $p \notin \bigcup_{n<m} I_n$, we can
choose a, b with $a < p < b$ such that $I_j \cap (a, b) = \emptyset$ for $j < m$. Since $\langle x_n\rangle$
converges to p, there is k such that $x_n \in (a, b)$ for $n \ge k$. Then for any $n \ge k$,
$x_n \notin I_j$ for any $j < m$, so for all $n \ge k$ either $x_n = p$ or $x_n \in A_j$ for some
$j \ge m$. Hence for any $n \ge k$ we have $f(x_n) = q$ or $f(x_n) \in B_j$ for some
$j \ge m$, which implies $f(x_n) \in (c, d)$. This shows that $\langle f(x_n)\rangle$ converges to
$f(p) = q$.


Thus f is continuous. Similarly $f^{-1}$ is continuous. □


Proposition 1091. Let $\langle [a_n, b_n] \mid n \in \mathbb{N}\rangle$ be a sequence of pairwise disjoint closed
intervals converging to p, where $p \notin \bigcup_n [a_n, b_n]$, and suppose that for each n, $A_n$
is a countable closed set contained in $(a_n, b_n)$ with the CB-rank of $A_n$ being $\alpha_n, k_n$.
Let $\alpha = \sup_n \alpha_n$ and $A = \bigcup_n A_n \cup \{p\}$.


1. If $\alpha_n = \alpha$ for infinitely many n, then the CB-rank of A is $\alpha + 1, 1$.
2. If $\alpha_n < \alpha$ for all n, so that $\alpha = \sup_n \alpha_n$ is a limit ordinal, then the CB-rank of A
is $\alpha, 1$.


Proof. For the first part, note that $D^{(\alpha)}(A_n)$ is finite for all n and is
nonempty for infinitely many n, hence $p \in D^{(\alpha)}(A)$ and so $D^{(\alpha)}(A)$ is a closed
set of order type $\omega + 1$ with greatest element p. Therefore $D^{(\alpha+1)}(A) = \{p\}$.

For the second part, note that $D^{(\alpha)}(A_n) = \emptyset$ for all n, but for every $\beta < \alpha$ we
have $D^{(\beta)}(A_n) \ne \emptyset$ for infinitely many n, and so $p \in D^{(\beta)}(A)$ for all $\beta < \alpha$. Hence
$p \in \bigcap_{\beta<\alpha} D^{(\beta)}(A) = D^{(\alpha)}(A)$. It follows that $D^{(\alpha)}(A) = \{p\}$. □


Corollary 1092. Let $0 < \alpha < \omega_1$, $n \in \mathbb{N}$, and let E be a closed and bounded
subset of R having order type $\omega^{\alpha} n + 1$. Then the CB-rank of E is $\alpha, n$.


Proof. First note that $\omega^{\alpha} n + 1 = (\omega^{\alpha} + 1) + (\omega^{\alpha} + 1) + \cdots + (\omega^{\alpha} + 1)$ (with
n summands), so E can be partitioned into n closed sets $E_1 < E_2 < \cdots < E_n$ with
the order type of each $E_k$ being $\omega^{\alpha} + 1$. Since for disjoint closed sets A and B we
have $D^{(\alpha)}(A \cup B) = D^{(\alpha)}(A) \cup D^{(\alpha)}(B)$, it suffices to show that the CB-rank of
$\omega^{\alpha} + 1$ is $\alpha, 1$. But this follows from the previous proposition by a routine transfinite
induction argument, since if $\alpha$ is a successor ordinal with $\alpha = \beta + 1$ then
$\omega^{\alpha} + 1 = \sum_n (\omega^{\beta} + 1) + 1$, and if $\alpha$ is a limit ordinal then
$\omega^{\alpha} + 1 = \sum_n (\omega^{\alpha_n} + 1) + 1$ where $\alpha_n < \alpha$ for all n with $\sup_n \alpha_n = \alpha$. □


Proposition 1093. For all $0 < \alpha < \omega_1$ and $n \in \mathbb{N}$, there is a closed and bounded
subset of R having order type $\omega^{\alpha} n + 1$ and hence having CB-rank $\alpha, n$.


Proof. Recall that every countable order X can be continuously order-embedded
in R (Theorem 551), and that the embedded image is closed and bounded in R if
X is complete and with endpoints (Problem 1078). Since $\omega^{\alpha} n + 1$ is a countable
complete order with endpoints, the result follows.

(Alternatively, one can inductively build a continuously order-embedded set
$A \subseteq \mathbb{R}$ of order type $\omega^{\alpha} + 1$ ($\alpha < \omega_1$) by taking subsets
$A_1 < A_2 < \cdots < A_n < \cdots$ and ordinals $\alpha_n$, $n \in \mathbb{N}$, such that $A_n$ is a closed set of
order type $\omega^{\alpha_n} + 1$, $\sup_n (\alpha_n + 1) = \alpha$, and $A = \bigcup_n A_n \cup \{p\}$ where
$p = \sup \bigcup_n A_n$.) □


Theorem 1094. Every nonempty countable closed bounded subset A of R is 
homeomorphic to a countable successor ordinal. 


Proof. The result is obvious if A is finite, so assume that A is infinite. 

It suffices to show that every infinite countable closed bounded set is homeomor- 
phic to a well-ordered countable closed set with a greatest element. 

The proof will be by induction on the Cantor—Bendixson rank of A. 

Let $\nu, n$ ($\nu > 0$, $n \in \mathbb{N}$) be the CB-rank of A and suppose that the result is true
for all sets having CB-rank $\mu, m$ with $\mu < \nu$. We first do the proof for the case
$n = 1$. Then $D^{(\nu)}(A) = \{p\}$ for some $p \in \mathbb{R}$. Since the set A is nowhere dense, for
each $x < y$ we can choose a, b with $x < a < b < y$ and $[a, b] \cap A = \emptyset$. Hence we
can choose sequences $\langle a_n\rangle$, $\langle b_n\rangle$, $\langle c_n\rangle$, $\langle d_n\rangle$ and sequences of sets $\langle A_n\rangle$, $\langle C_n\rangle$ such
that


1. $a_1 < b_1 < \cdots < a_n < b_n < \cdots < p < \cdots < c_n < d_n < \cdots < c_1 < d_1$;
2. $\sup_n a_n = \sup_n b_n = p$ and $\inf_n c_n = \inf_n d_n = p$;

3. $A_n \subseteq (a_n, b_n)$ and $C_n \subseteq (c_n, d_n)$ are closed sets;

4. $A = (\bigcup_n A_n) \cup \{p\} \cup (\bigcup_n C_n)$.


To do this, start with $a_1 < \inf A$ and $d_1 > \sup A$, and choose $b_1, a_2$ such that
$\max(a_1, p - 1) < b_1 < a_2 < p$ and $[b_1, a_2] \cap A = \emptyset$.
Next choose $b_2, a_3$ such that
$\max(a_2, p - 1/2) < b_2 < a_3 < p$ and $[b_2, a_3] \cap A = \emptyset$,
and so on. Similarly complete the sequences $\langle c_n\rangle$ and $\langle d_n\rangle$. Then put
$A_n = (a_n, b_n) \cap A = [a_n, b_n] \cap A$ and $C_n = (c_n, d_n) \cap A = [c_n, d_n] \cap A$, so that $A_n$ and
$C_n$ are closed subsets of A with $A = (\bigcup_n A_n) \cup \{p\} \cup (\bigcup_n C_n)$.


Now put, for each $n \in \mathbb{N}$,

$$I_{2n-1} = [a_n, b_n], \quad I_{2n} = [c_n, d_n] \quad\text{and}\quad E_{2n-1} = A_n, \quad E_{2n} = C_n.$$

Then note that the sequence $\langle I_n\rangle$ is a pairwise disjoint sequence of closed intervals
converging to p with $p \notin I_n$ for any n, and that $A = \bigcup_n E_n \cup \{p\}$.

Now for each n, the closed countable set $E_n$ has CB-rank $\mu, m$ for some $\mu, m$
with $\mu < \nu$, since otherwise $D^{(\nu)}(E_n)$ would be nonempty, and so $D^{(\nu)}(A)$ would
contain a point distinct from p, a contradiction. Hence by induction hypothesis each
$E_n$, if nonempty, will be homeomorphic to a countable successor ordinal. Since each
countable successor ordinal can be continuously order-embedded in any interval as
a closed set, we can choose a well-ordered countable closed subset $F_1$ of $[0, \frac{1}{2}]$ such
that $E_1$ is homeomorphic to $F_1$. Similarly, choose a well-ordered countable closed
subset $F_2$ of $(\frac{1}{2}, \frac{3}{4}]$ such that $E_2$ is homeomorphic to $F_2$. In general, choose a
well-ordered countable closed subset $F_n$ of $(1 - \frac{1}{2^{n-1}}, 1 - \frac{1}{2^n}]$ such that $E_n$ is
homeomorphic to $F_n$. Being bounded and closed, each set $F_n$, if nonempty, must have a
greatest element. Finally put:

$$F := \bigcup_{n \in \mathbb{N}} F_n \cup \{1\},$$

which is a well-ordered countable closed set with the greatest element 1. Then
by Proposition 1090, the set A is homeomorphic to the set F, and therefore to
a countable successor ordinal.

If $D^{(\nu)}(A)$ has n elements with $n > 1$, then we can order the elements of $D^{(\nu)}(A)$
as $D^{(\nu)}(A) = \{p_1 < p_2 < \cdots < p_n\}$, and then choose elements $a_1, a_2, \dots, a_n$ and
$b_1, b_2, \dots, b_n$ such that

$$a_1 < p_1 < b_1 < a_2 < p_2 < b_2 < \cdots < a_n < p_n < b_n,$$
$$a_1 < \inf A, \quad b_n > \sup A, \quad\text{and}$$
$$[b_1, a_2] \cap A = \emptyset, \quad [b_2, a_3] \cap A = \emptyset, \quad \dots, \quad [b_{n-1}, a_n] \cap A = \emptyset.$$

Putting $H_k = A \cap [a_k, b_k] = A \cap (a_k, b_k)$ for each $k = 1, 2, \dots, n$, we note that each
$H_k$ is a countable closed set with $D^{(\nu)}(H_k) = \{p_k\}$, hence $H_k$ is homeomorphic
to a countable successor ordinal $\alpha_k$. Thus $A = \bigcup_{k=1}^{n} H_k$ is homeomorphic to the
countable successor ordinal $\alpha_1 + \alpha_2 + \cdots + \alpha_n$. □


Corollary 1095. Let A be a countably infinite closed bounded set with CB-rank
$\nu, n$, with $0 < \nu < \omega_1$ and $n \in \mathbb{N}$. Then A is homeomorphic to the ordinal $\omega^{\nu} n + 1$,
and hence to any closed subset of R having order type $\omega^{\nu} n + 1$.


Proof. By the last theorem, A is homeomorphic to a countably infinite successor
ordinal $\alpha$. By Proposition 1087, $\alpha$ is homeomorphic to $\omega^{\mu} m + 1$ for some $\mu, m$.
Hence A and $\omega^{\mu} m + 1$ are homeomorphic, and so they must have identical CB-ranks.
Therefore $\mu = \nu$ and $m = n$. □


We thus arrive at our main result:


Corollary 1096 (Classification of countable closed bounded sets). Consider the
series of countably infinite successor ordinals of the form:


$$\omega^{\nu} n + 1, \qquad 0 < \nu < \omega_1, \; n \in \mathbb{N}.$$


Every countably infinite closed bounded set is homeomorphic to one and only one
of the ordinals above. Conversely, for each $0 < \nu < \omega_1$ and $n \in \mathbb{N}$ there exists a
countably infinite closed bounded set having order type $\omega^{\nu} n + 1$.

Hence the above series gives a complete enumeration, up to homeomorphism, of
all countably infinite closed bounded sets into $\aleph_1$ many pairwise non-homeomorphic
sets.


Remark. Although we are dealing exclusively with sets of reals, the above result
can be stated in the more general context of topological spaces. We will not define
topological spaces or the relevant related notions, but the general statement is that
every countable compact Hausdorff space X with $|X| \ge 2$ is homeomorphic to
$\omega^{\alpha} n + 1$ for some unique $\alpha < \omega_1$ and $n \in \mathbb{N}$. This follows immediately from the
above result since every countable compact Hausdorff space is homeomorphic to a
subset of R. (An alternative proof that every countable compact Hausdorff space is
homeomorphic to a countable ordinal can be given using the Sorgenfrey topology
on the reals.)


16.4 Cantor and Uniqueness of Trigonometric Series 


A trigonometric series is a series of the form

$$a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$$

which may or may not converge for a given value of $x \in [0, 2\pi]$. If the above series
converges for all x, it defines a periodic function. A special type of trigonometric
series is the familiar Fourier series, where the coefficients $a_n$ and $b_n$ are given
as $a_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \cos nt \, dt$ and $b_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \sin nt \, dt$ ($n = 1, 2, \dots$) for
some function f integrable on $[0, 2\pi]$. However, there are everywhere convergent
trigonometric series which are not Fourier series.
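To make the notion of convergence at a point concrete, here is a minimal Python sketch that evaluates a partial sum of a trigonometric series. The function name `partial_sum` and the list-based encoding of the coefficients are assumptions made for this illustration only; convergence at x means that these partial sums approach a limit as more terms are included.

```python
import math


def partial_sum(a0, a, b, x):
    """Return a0 + sum over n = 1..N of (a[n-1] cos nx + b[n-1] sin nx).

    The lists a and b hold the coefficients a_1..a_N and b_1..b_N.
    """
    return a0 + sum(
        an * math.cos(n * x) + bn * math.sin(n * x)
        for n, (an, bn) in enumerate(zip(a, b), start=1)
    )


# Three terms of the sine series sin x + (1/3) sin 3x, evaluated at x = pi/2.
value = partial_sum(0.0, [0.0, 0.0, 0.0], [1.0, 0.0, 1.0 / 3.0], math.pi / 2)
```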

Cantor’s work began with the uniqueness problem for trigonometric series,
which asks this: If two trigonometric series converge and agree everywhere then will
they necessarily have identical coefficients? More precisely, if for all $x \in [0, 2\pi]$ we
have

$$a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) = c_0 + \sum_{n=1}^{\infty} (c_n \cos nx + d_n \sin nx),$$

then does it follow that $a_n = c_n$ and $b_n = d_n$ for all n?
Cantor’s first important result on the uniqueness problem was that the above is
indeed true.


Theorem (Cantor 1870). If for all $x \in [0, 2\pi]$ the series on both sides of the
equation displayed above converge and are equal, then $a_n = c_n$ and $b_n = d_n$ for
all n.


Cantor then continued to work on extending the result to the case where the
hypothesis is weakened to allow for an “exception set” $E \subseteq [0, 2\pi]$ on which the
two series may not agree (or fail to converge). In other words, given $E \subseteq [0, 2\pi]$,
the uniqueness problem for an exception set E asks:


If two trigonometric series agree outside E (that is, if the above equality holds for
all $x \in [0, 2\pi] \smallsetminus E$), then do they necessarily have identical coefficients?


If the answer to this question is yes, we say that the exception set E is a set of
uniqueness. Thus Cantor’s first result above says that the empty set $E = \emptyset$ is a set
of uniqueness.

Cantor next established that E is a set of uniqueness if E is an arbitrary finite
set. He went on to point out that E is a set of uniqueness if E is any closed set with
a finite number of limit points (that is, $D(E)$ is finite), or if E is a closed set whose
set of limit points in turn has only finitely many limit points (that is, $D^{(2)}(E)$ is
finite), and so on for any finite number of iterations of orders of limit points. In our
terminology, Cantor essentially established the following.


Cantor Uniqueness Theorem (Cantor 1872). If $E \subseteq [0, 2\pi]$ is closed and of
“finite Cantor–Bendixson rank,” that is, if $D^{(n)}(E)$ is finite for some $n \in \mathbb{N}$, then E
is a set of uniqueness.


The next natural extension would be to consider limit points of order $\omega$, and ask
if E is a set of uniqueness when $D^{(\omega)}(E)$ is finite; and one can proceed further
through the transfinite ordinals. This is indeed true, and in fact any countable 
closed set is a set of uniqueness, but that was proved much later. When Cantor was 
investigating the uniqueness problem, notions such as “transfinite ordinal number” 
and “countable set” did not exist, and after establishing his theorem stated above, 
Cantor created and developed such foundational concepts almost single-handedly, 
giving birth to set theory. Thus Cantor’s quest for generalizing his uniqueness results 
led him to consider transfinite iterations of the operation of forming the derived set 
(the set of limit points of a set), and then on to far-reaching abstractions such as 
countable and uncountable sets, the topology of real point sets, the theory of orders, 
order types, well-ordered sets, transfinite ordinals, and cardinals. 

Busy in his creation and development of the theory of the transfinite, Cantor
never returned to the problem of uniqueness of trigonometric series. Characterizing
the sets of uniqueness turned out to be an extremely difficult problem, and research
has been continuing on it for more than a hundred years. Interestingly, the use of set
theory and transfinite ordinals in the investigation of the problem of uniqueness has
returned, through an area of set theory known as Descriptive Set Theory.

For more details on the connection between Cantor’s creation of set theory and 
the problem of uniqueness of trigonometric series, we refer the reader to Dauben’s 
book [9] and the article of Kechris [39], where further references can be found. 


Chapter 17 
Brouwer’s Theorem and Sierpinski’s Theorem 


Abstract In this chapter we apply the theory of orders from Chap. 8, especially 
Cantor’s theorem on countable dense orders, to prove two classical theorems: 
Brouwer’s topological characterization of the Cantor set, and Sierpinski’s topologi- 
cal characterization of the rationals. 


17.1 Brouwer’s Theorem 


The Cantor set is an example of a perfect bounded nowhere dense subset of R. 


Problem 1097. Show that if $E \subseteq \mathbb{R}$ is homeomorphic to the Cantor set then E is
perfect, bounded, and nowhere dense.


[Hint: The Bolzano–Weierstrass theorem and the Intermediate Value Theorem may
help.]


Theorem 1098 (Brouwer). Any two perfect bounded and nowhere dense subsets 
of R are homeomorphic to each other, and hence to the Cantor set. 


Since two closed subsets of R having the same order type are homeomorphic, 
Brouwer’s theorem follows from the following stronger result. 


Theorem 1099. Let E be a perfect bounded and nowhere dense subset of R. Then 
there is an order-isomorphism of R onto R which maps E onto the Cantor set. 


Proof. Let $G = \mathbb{R} \smallsetminus E$ so that G is open, and hence G is the union of a unique
countable family of pairwise disjoint nonempty open intervals. Since E is bounded,
there are two unbounded open intervals in the decomposition of G, one of the form
$(-\infty, a)$ and another of the form $(b, \infty)$ where $a = \inf E$ and $b = \sup E$. All other
open intervals in the decomposition of G are bounded. Let C be the countable family
of all bounded open intervals in the decomposition of G. Since the intervals in C
are pairwise disjoint, the family C is naturally ordered, where for intervals $I, J \in C$
we have $I < J$ if and only if $x < y$ for all $x \in I$ and $y \in J$. Since E is nowhere
dense, the ordering of C is a dense order without endpoints.

Similarly if D is the family of bounded open intervals removed in the construc- 
tion of the Cantor set, we find that using the ordering on the intervals, D forms a 
countable dense order without endpoints. 

Hence, by Cantor’s theorem on countable dense orders without endpoints, there
is an order isomorphism $\varphi$ from C onto D. For each $I \in C$ there is a unique increasing
linear function $f_I$ mapping the interval I onto the interval $\varphi(I)$. Also, let $f_0$ denote
the unique translation map $x \mapsto x - a$ mapping the interval $(-\infty, a]$ onto the interval
$(-\infty, 0]$ and let $f_1$ denote the unique translation map $x \mapsto x - b + 1$ mapping the
interval $[b, \infty)$ onto the interval $[1, \infty)$. Combining all the mappings $f_I$ ($I \in C$),
$f_0$ and $f_1$, we get an order preserving bijection $f^*$ mapping the set G onto the
complement of the Cantor set. Now note that since E is nowhere dense, for each
$x \in E$ and any u, v with $u < x < v$, there exist $s, t \in G$ with $u < s < x < t < v$ so
that $f^*$ is defined at s and t. The same is true for the Cantor set and its complement.
Moreover R is a complete order. Hence $f^*$ extends uniquely to an order preserving
bijection f mapping R onto R, which can be defined as


$$f(x) := \sup\{ f^*(t) \mid t < x, \; t \in G\} = \inf\{ f^*(t) \mid t > x, \; t \in G\}.$$


Clearly, f is then an order-isomorphism of R onto R which maps E onto the
Cantor set. □


We had seen that every generalized Cantor set (i.e., any set generated by a Cantor 
system) is bounded, perfect, and nowhere dense. We now have the converse result. 


Corollary 1100. Let E be a bounded perfect nowhere dense set. Then E is 
a generalized Cantor set, that is, there is a Cantor system of intervals which 
generates E. 


Proof. By the theorem, we can fix a bijective order-isomorphism $f\colon \mathbb{R} \to \mathbb{R}$ which
maps the Cantor set onto E. Let $\langle I_u \mid u \in \{0, 1\}^*\rangle$ be the standard Cantor system
which generates the Cantor set. We show that $\langle f[I_u] \mid u \in \{0, 1\}^*\rangle$ is a Cantor
system which generates E.

Since f is an order-isomorphism and $I_u$ is a bounded proper closed interval,
$f[I_u]$ is a bounded proper closed interval for any $u \in \{0, 1\}^*$. Since f is a
bijection and $I_{u^\frown 0} \cap I_{u^\frown 1} = \emptyset$, we have $f[I_{u^\frown 0}] \cap f[I_{u^\frown 1}] = \emptyset$. Again, since f
is a bijection, if $b \in \{0, 1\}^{\mathbb{N}}$ and x is the unique member of the singleton $\bigcap_n I_{b|n}$,
then $f(x)$ is the unique member of the singleton $\bigcap_n f[I_{b|n}]$. Thus E is generated by
$\langle f[I_u] \mid u \in \{0, 1\}^*\rangle$. Finally, for any $b \in \{0, 1\}^{\mathbb{N}}$, we have $\lim_n \mathrm{len}(f[I_{b|n}]) = 0$,
since if the intersection of a nested sequence of intervals results in a singleton,
then the lengths of the intervals in the sequence must approach zero. Hence
$\langle f[I_u] \mid u \in \{0, 1\}^*\rangle$ is a Cantor system. □


Corollary 1101. A set is bounded perfect nowhere dense if and only if it is a
generalized Cantor set generated by some Cantor system of intervals.


Thus a perfect bounded nowhere dense set not only is order-isomorphic and
homeomorphic to the Cantor set, but also has the same structure in terms of definition via
interval trees. So any perfect bounded nowhere dense subset of R (equivalently any
generalized Cantor set) will be called a Cantor set.

Note the difference between the term “a Cantor set” (any perfect bounded 
nowhere dense subset of R) and the term “the Cantor set” (the specific subset K 
of [0, 1] obtained by repeatedly removing middle-third open intervals). 


17.2 Homeomorphic Permutations of the Cantor Set 


We will now introduce some especially nice homeomorphisms of the Cantor set 
onto itself. 

Recall the natural bijection F between the set $2^{\mathbb{N}} := \{0, 1\}^{\mathbb{N}}$ of all infinite binary
sequences and the Cantor set K given by the mapping


$$\langle b_1, b_2, \dots, b_n, \dots\rangle \mapsto F(\langle b_n\rangle) = \sum_{n=1}^{\infty} \frac{2 b_n}{3^n}.$$


Using this bijection, we will identify $2^{\mathbb{N}}$ with the Cantor set. This means that
elements of $2^{\mathbb{N}}$ will represent points of the Cantor set, subsets of $2^{\mathbb{N}}$ will represent
subsets of the Cantor set, and so on.

Now consider the subset $K_1 := [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$ of $[0, 1]$. The two closed intervals
of $K_1$ are translates of each other, and using these translations we can get a
homeomorphism of $K_1$ onto itself which interchanges these two intervals. More
precisely, this homeomorphism $f_1$ can be defined as:


$$f_1(x) = \begin{cases} x + \frac{2}{3} & \text{if } x \in [0, \frac{1}{3}], \\ x - \frac{2}{3} & \text{if } x \in [\frac{2}{3}, 1]. \end{cases}$$


Now, the Cantor set is a subset of $K_1$, and note that $f_1$ maps the Cantor set onto
itself. Restricting $f_1$ to the Cantor set, we get a map $g_1$ which is a homeomorphic
permutation of the Cantor set.

In view of the identification of the Cantor set with $2^{\mathbb{N}}$, this homeomorphic
permutation $g_1$ of the Cantor set admits a simpler definition in terms of elements
of $2^{\mathbb{N}}$:


$$g_1(\langle b_1, b_2, b_3, \dots, b_n, \dots\rangle) = \langle 1 - b_1, b_2, b_3, \dots, b_n, \dots\rangle,$$


or more informally by saying that the permutation $g_1$ of $2^{\mathbb{N}}$ onto $2^{\mathbb{N}}$ transforms a
binary sequence into another one by “flipping the first bit of the sequence.”

Consider again the set $K_2 := [0, \frac{1}{9}] \cup [\frac{2}{9}, \frac{1}{3}] \cup [\frac{2}{3}, \frac{7}{9}] \cup [\frac{8}{9}, 1]$ found in the
second stage of the construction of the Cantor set. The intervals $[0, \frac{1}{9}]$ and $[\frac{2}{9}, \frac{1}{3}]$ are
translates of each other and can be interchanged via these translations, and similarly
the intervals $[\frac{2}{3}, \frac{7}{9}]$ and $[\frac{8}{9}, 1]$ can be interchanged via similar translations to yield a
homeomorphic permutation $f_2$ of $K_2$:


$$f_2(x) = \begin{cases} x + \frac{2}{9} & \text{if } x \in [0, \frac{1}{9}] \cup [\frac{2}{3}, \frac{7}{9}], \\ x - \frac{2}{9} & \text{if } x \in [\frac{2}{9}, \frac{1}{3}] \cup [\frac{8}{9}, 1]. \end{cases}$$


Once again we can restrict $f_2$ to the Cantor set to obtain a homeomorphic
permutation $g_2$ of the Cantor set, and in view of the identification of the Cantor
set with $2^{\mathbb{N}}$, the homeomorphic permutation $g_2$ can be defined in terms of elements
of $2^{\mathbb{N}}$ as:


$$g_2(\langle b_1, b_2, b_3, \dots, b_n, \dots\rangle) = \langle b_1, 1 - b_2, b_3, \dots, b_n, \dots\rangle,$$


or more informally by saying that the permutation $g_2$ of $2^{\mathbb{N}}$ onto $2^{\mathbb{N}}$ transforms a
binary sequence into another one by “flipping the second bit of the sequence.”
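The correspondence between the interval swap on $K_1$ and the first-bit flip can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: it works with finite binary prefixes (all later bits taken to be 0), and the names `cantor_point`, `flip_bit`, and `f1` are hypothetical helpers introduced here, not notation from the text.

```python
from fractions import Fraction


def cantor_point(bits):
    """F applied to a finite 0/1 prefix, with all later bits taken to be 0."""
    return sum(
        (Fraction(2 * b, 3 ** n) for n, b in enumerate(bits, start=1)),
        Fraction(0),
    )


def flip_bit(bits, n):
    """The n-th bit flip on a finite prefix (n is 1-indexed)."""
    flipped = list(bits)
    flipped[n - 1] = 1 - flipped[n - 1]
    return flipped


def f1(x):
    """The interval swap on K_1 = [0, 1/3] U [2/3, 1]."""
    if 0 <= x <= Fraction(1, 3):
        return x + Fraction(2, 3)
    if Fraction(2, 3) <= x <= 1:
        return x - Fraction(2, 3)
    raise ValueError("x is not in K_1")


bits = [0, 1, 0, 1]
x = cantor_point(bits)  # 2/9 + 2/81
# Swapping the two intervals of K_1 agrees with flipping the first bit.
assert f1(x) == cantor_point(flip_bit(bits, 1))
# Flipping the second bit moves the point by exactly 2/9.
assert abs(cantor_point(flip_bit(bits, 2)) - x) == Fraction(2, 9)
```

Using `Fraction` rather than floating point keeps the comparisons exact, which matters because the check is an equality of points of the Cantor set.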

In general, for each $n \in \mathbb{N}$, the operation of “flipping the n-th bit of an infinite
binary sequence” gives a homeomorphic permutation $g_n$ of the Cantor set.

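Even more generally, for each subset A of N, we can define the map $g_A\colon 2^{\mathbb{N}} \to 2^{\mathbb{N}}$
by setting


$$g_A(\langle b_1, b_2, \dots, b_n, \dots\rangle) = \langle b'_1, b'_2, \dots, b'_n, \dots\rangle,$$


where


$$b'_n = \begin{cases} 1 - b_n & \text{if } n \in A, \\ b_n & \text{otherwise.} \end{cases}$$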


Problem 1102. Prove that for each $A \subseteq \mathbb{N}$ the mapping $g_A$ defined above is a
homeomorphic permutation of the Cantor set.


[Hint: Using the ternary expansion representation for the elements of the Cantor set
may help. Also note that $g_A^{-1} = g_A$, and so it suffices to show that $g_A$ is continuous.]


Endpoints and Internal Points of the Cantor Set 


Consider the closed intervals obtained in the formation of the Cantor set, namely


$$[0, 1], \; [0, \tfrac{1}{3}], \; [\tfrac{2}{3}, 1], \; [0, \tfrac{1}{9}], \; [\tfrac{2}{9}, \tfrac{1}{3}], \; [\tfrac{2}{3}, \tfrac{7}{9}], \; [\tfrac{8}{9}, 1], \; \dots$$


The endpoints of these intervals
will be called the endpoints of the Cantor set. All other points of the Cantor set will
be called the internal points of the Cantor set. Note that the internal points of the
Cantor set are precisely its two-sided limit points, while the endpoints of the Cantor
set are its one-sided limit points. (More generally, for any dense-in-itself set A, we
can define the endpoints of A to be the one-sided limit points of A, and the internal
points of A to be the two-sided limit points of A.)


Problem 1103. Show that 


1. 1/4 is an internal point of the Cantor set. 
2. Under the identification of the Cantor set with $2^{\mathbb{N}}$, an infinite binary sequence is
an endpoint of the Cantor set if and only if it is eventually constant.
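For experimenting with the identification used in this problem, the following Python sketch computes the leading ternary digits of a rational number and converts them to bits. It is an illustration under assumptions: `ternary_digits` is a hypothetical helper using the greedy expansion, which yields the 0/2 expansion for 1/4 but not for every point of the Cantor set (for example, it writes 1/3 as 0.1000... rather than 0.0222...). A finite prefix can only suggest a pattern and proves nothing about the whole sequence.

```python
from fractions import Fraction


def ternary_digits(x, count):
    """Return the first `count` greedy base-3 digits of a rational x in [0, 1)."""
    digits = []
    for _ in range(count):
        x *= 3
        d = int(x)  # the integer part is the next ternary digit
        digits.append(d)
        x -= d
    return digits


digits = ternary_digits(Fraction(1, 4), 8)
# Ternary digit 0 or 2 corresponds to bit 0 or 1 under the bijection F.
bits = [d // 2 for d in digits]
```

For 1/4 the remainder returns to 1/4 after two steps, so the digit pattern 0, 2 repeats forever and the corresponding binary sequence alternates.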


The following technical result will be needed in the proof of Sierpinski’s theorem: 


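Theorem 1104. Let E be the set of endpoints of the Cantor set K. Given any
countable subset C of K, there is a homeomorphic permutation f of K such that
$f(x)$ is an internal point of K for every $x \in C$, that is, such that $f[C] \cap E = \emptyset$.

Proof. We use the identification of the Cantor set with $2^{\mathbb{N}}$, and work exclusively
in $2^{\mathbb{N}}$.

Fix a bijection $k\colon \mathbb{N} \times \mathbb{N} \to \mathbb{N}$.

Now let $C = \{x^{(1)}, x^{(2)}, \dots, x^{(m)}, \dots\}$ be a countable subset of $2^{\mathbb{N}}$, and define
$A \subseteq \mathbb{N}$ by:


$$A := \{k(m, 2n - 1) \mid m, n \in \mathbb{N} \text{ and } x^{(m)}(k(m, 2n - 1)) = 0\}$$
$$\cup \; \{k(m, 2n) \mid m, n \in \mathbb{N} \text{ and } x^{(m)}(k(m, 2n)) = 1\},$$


and put $f = g_A$ as above. In other words, $f\colon 2^{\mathbb{N}} \to 2^{\mathbb{N}}$ is defined by setting
$f(\langle b_1, b_2, \dots, b_n, \dots\rangle) = \langle b'_1, b'_2, \dots, b'_n, \dots\rangle$,
where


$$b'_n = \begin{cases} 1 - b_n & \text{if } n \in A, \\ b_n & \text{otherwise.} \end{cases}$$


Put $y^{(m)} = f(x^{(m)})$. Then for each $m \in \mathbb{N}$ we have $y^{(m)}(k(m, 2n - 1)) = 1$ and
$y^{(m)}(k(m, 2n)) = 0$ for all n. Since k is injective, it follows that $y^{(m)}(j) = 1$ for
infinitely many j and $y^{(m)}(j) = 0$ for infinitely many j. Hence $y^{(m)} = f(x^{(m)})$ is
an internal point of $2^{\mathbb{N}}$ for each $m \in \mathbb{N}$. □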
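The construction in this proof can be sketched in Python on a finite window of positions. Everything named here is an assumption made for the illustration: the proof only needs some bijection k, and the Cantor pairing function `pair` is one concrete choice; `xs` is a hypothetical enumeration of C built from three sample sequences; sequences are modelled as functions from positions 1, 2, 3, ... to bits.

```python
def pair(m, n):
    """Cantor pairing: a bijection from N x N onto N, where N = {1, 2, 3, ...}."""
    s = m + n - 2
    return s * (s + 1) // 2 + m


def unpair(j):
    """Inverse of pair: the unique (m, n) with pair(m, n) == j."""
    s = 0
    while (s + 1) * (s + 2) // 2 < j:
        s += 1
    m = j - s * (s + 1) // 2
    return m, s + 2 - m


def in_A(j, xs):
    """Decide whether position j belongs to the set A built from xs.

    xs(m) is the m-th sequence, itself a function from positions to bits.
    """
    m, i = unpair(j)
    bit = xs(m)(j)
    # Odd second coordinate: j is in A when the bit is 0 (so it flips to 1).
    # Even second coordinate: j is in A when the bit is 1 (so it flips to 0).
    return bit == 0 if i % 2 == 1 else bit == 1


def g_A(x, xs):
    """Apply f = g_A to the sequence x: flip exactly the positions in A."""
    return lambda j: 1 - x(j) if in_A(j, xs) else x(j)


def xs(m):
    """Hypothetical enumeration of C: three sample sequences, repeated cyclically."""
    samples = [lambda j: 0, lambda j: 1, lambda j: j % 2]
    return samples[(m - 1) % 3]


# Check the two identities from the proof on a finite window.
for m in range(1, 7):
    y = g_A(xs(m), xs)
    assert all(y(pair(m, 2 * n - 1)) == 1 for n in range(1, 30))
    assert all(y(pair(m, 2 * n)) == 0 for n in range(1, 30))
```

The constant sequences in `xs` are endpoints of the Cantor set, yet their images take both values along the row of positions reserved for them, which is the mechanism the proof relies on.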
17.3 Sierpinski’s Theorem


We begin with the following immediate consequence of Cantor’s theorem on
countable dense orders.


Proposition 1105. Any countable dense subset of R is homeomorphic to Q. 


Proof. This follows since if A is a countable dense subset of R, then by Cantor’s
theorem we can get an order isomorphism f between A and Q. But both in A and
in Q, every point is a two-sided limit point, and so both sets are continuously
order-embedded in R. Hence f is a homeomorphism. □


Now given any countable set $A \subseteq \mathbb{R}$, we can take $A \cup \mathbb{Q}$ to get a countable dense
subset of R which contains A. Hence we have:


Corollary 1106. Any countable subset of R is homeomorphic to a subset of Q. 


Proposition 1107. If A is a dense subset of the Cantor set consisting only of 
internal points of the Cantor set, then every point of A is a two-sided limit point 
of A (in R) and hence A is continuously order-embedded in R. 


Corollary 1108. If C is a countable subset of the Cantor set consisting only of
internal points and C is dense in the Cantor set, then C is a continuously
order-embedded subset of R of order type $\eta$, and so C is homeomorphic to Q.


Theorem 1109. If C is a countable subset of the Cantor set which is dense in it, 
then C is homeomorphic to Q. 


Proof. By Theorem 1104 there is a homeomorphic permutation of the Cantor set
mapping C to a subset D of the Cantor set consisting only of its internal points.
Then D is a countable dense subset of the Cantor set consisting only of its internal
points, and hence homeomorphic to Q by Corollary 1108. □


Problem 1110. The set of endpoints of the Cantor set is a countable dense subset 
of it, and hence is homeomorphic to Q. 


Theorem 1111 (Sierpinski). Every countable dense-in-itself subset of R is
homeomorphic to Q.


Proof. Let A be a countable dense-in-itself subset of R. Then A is homeomorphic
to a subset of Q. Since the Cantor set has subsets homeomorphic to Q, it follows
that A is homeomorphic to a subset B of the Cantor set. B is then a dense-in-itself
subset of the Cantor set, and hence its closure $\overline{B}$ is a perfect subset of the Cantor set,
and hence $\overline{B}$ is a perfect bounded nowhere dense set. By Brouwer’s Theorem, $\overline{B}$ is
homeomorphic to the Cantor set and hence B is homeomorphic to a countable dense
subset C of the Cantor set. By Theorem 1109, C is homeomorphic to Q. Therefore
B, and hence A, is homeomorphic to Q. □


Problem 1112. Show that $1 + \eta$, $1 + \eta + 1$, and $\eta + 2 + \eta$ are all homeomorphic
to $\eta$.


Problem 1113. Prove that $2\eta$ is homeomorphic to $\eta$.


[Hint: The set of endpoints of the Cantor set other than 0 and 1 has order type $2\eta$
and is continuously order-embedded in R.]


17.4 Brouwer’s and Sierpinski’s Theorems in General Spaces 


Although in this text we are restricting ourselves to subsets of R and will not 
define topological spaces or even metric spaces, let us point out that both Brouwer’s 
Theorem and Sierpinski’s Theorem can be stated in much more general settings. 

The general statement of Brouwer’s Theorem for metric spaces says: Any totally 
disconnected perfect compact metric space is homeomorphic to the Cantor set. 
Since any compact totally disconnected metric space is a zero-dimensional sepa- 
rable metric space and since any such space can be homeomorphically embedded in 
R, the general version of Brouwer’s Theorem follows from our special version for 
subsets of R. 

Sierpinski’s Theorem for general metric spaces says: Any countable metric space 
without isolated points is homeomorphic to Q. Again, since any countable metric 
space is a zero-dimensional separable metric space and since any such space can 
be homeomorphically embedded in R, the general version of Sierpinski’s Theorem 
follows from our special version for subsets of R. 


Chapter 18 
Borel and Analytic Sets 


Abstract This chapter covers some of the basic theory of Borel and Analytic Sets 
in the context of the real line. We define analytic sets using the Suslin operation, 
and show that they have all the regularity properties (measurability, Baire property, 
perfect set property), and therefore satisfy the continuum hypothesis—the best result 
possible without additional axioms. Along the way we obtain the Lusin Separation 
Theorem, Suslin’s theorem, the boundedness theorem, and an example of a non- 
Borel analytic set. 


18.1 Sigma-Algebras and Borel Sets 


Definition 1114. A nonempty collection S of subsets of R is called a Sigma- 
Algebra if 


1. S is closed under taking complements: if $A \in S$ then $\mathbb{R} \smallsetminus A \in S$; and
2. S is closed under countable unions: if $A_n \in S$ for all $n \in \mathbb{N}$ then $\bigcup_n A_n \in S$.


Trivially, the two-element family $\{\emptyset, \mathbb{R}\}$ and the power set P(R) are sigma-algebras.
More importantly, we have: The collection L of Lebesgue measurable sets is a 
sigma-algebra, and so is the collection Y of sets with Baire property. 


Problem 1115. Show that the collection S below is a sigma-algebra:

$$S := \{A \in P(\mathbb{R}) \mid \text{either } A \text{ or } \mathbb{R} \smallsetminus A \text{ is countable}\},$$

and that S is the smallest sigma-algebra containing all singletons of R.


Problem 1116. Show that if S is a sigma-algebra then


1. $\emptyset \in S$, $\mathbb{R} \in S$.
2. S is closed under countable intersections: If $A_n \in S$ for all $n \in \mathbb{N}$ then $\bigcap_n A_n \in S$.
3. S is closed under finite unions, finite intersections, and set differences.


Problem 1117. Show that the intersection of any nonempty family of
sigma-algebras is a sigma-algebra. Deduce that given any family C of subsets of R, there
is a smallest sigma-algebra containing C.


Definition 1118. Given a family C of subsets of R the smallest sigma-algebra 
containing C is called the sigma-algebra generated by C. 
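
On a finite universe, the generated sigma-algebra of Definition 1118 can be computed by brute-force closure, which may help build intuition. The following is only a sketch under that finiteness assumption (no such enumeration is possible for subsets of R); the name `generated_sigma_algebra` is illustrative and does not come from the text.

```python
from itertools import combinations


def generated_sigma_algebra(universe, family):
    """Smallest family of subsets of a FINITE universe that contains
    `family` and is closed under complements and unions.
    (On a finite universe, countable unions reduce to finite unions.)"""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(a) for a in family}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        # Close under complements.
        for a in current:
            complement = universe - a
            if complement not in sets:
                sets.add(complement)
                changed = True
        # Close under pairwise unions.
        for a, b in combinations(current, 2):
            if (a | b) not in sets:
                sets.add(a | b)
                changed = True
    return sets


# Hypothetical example: universe {1, 2, 3, 4}, generators {1} and {2}.
example = generated_sigma_algebra(range(1, 5), [{1}, {2}])
```

Here the atoms are {1}, {2}, and {3, 4}, so the result consists of all unions of atoms.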


Problem 1119. Show that 


1. The sigma-algebra generated by the measure zero sets together with the open 
sets equals the collection L of all measurable sets. 

2. The sigma-algebra generated by the meager sets together with the open sets 
equals the collection Y of all sets with Baire property. 


Definition 1120 (Borel Sets). B denotes the sigma-algebra generated by the open 
sets, and sets in B are called the Borel sets. 


Being a sigma-algebra, $B$ includes, along with open sets, all closed sets, $F_\sigma$ sets, $G_\delta$
sets, and so on. Also, since $L$ is a sigma-algebra containing all open sets, we have
$B \subseteq L$, that is, all Borel sets are Lebesgue measurable. Similarly, $B \subseteq Y$, so all
Borel sets have the Baire property. Thus $B \subseteq L \cap Y$.

Most effectively defined subsets of R normally encountered in analysis, including 
all examples we have seen so far, are Borel sets. 


Problem 1121. Let $(A_n \mid n \in \mathbf{N})$ be a sequence of Borel sets. Show that the set
$\{x \mid x \in A_n \text{ for all but finitely many } n\}$ is a Borel set. Similarly, the set
$\{x \mid x \in A_n \text{ for infinitely many } n\}$ is a Borel set.


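Problem 1122. Let $f_n: \mathbf{R} \to \mathbf{R}$ be a continuous function for each $n \in \mathbf{N}$. Show that
each of the following sets is Borel.


1. $\{x \in \mathbf{R} \mid \text{the sequence } (f_n(x) \mid n \in \mathbf{N}) \text{ is increasing but bounded}\}$.
2. $\{x \in \mathbf{R} \mid \lim_n f_n(x) = 0\}$.
3. $\{x \in \mathbf{R} \mid \lim_n f_n(x) \text{ exists}\}$.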


Problem 1123. If $A$ and $B$ are Borel sets and $f: A \to \mathbf{R}$ is continuous, then $f^{-1}[B]$ is a Borel set.


For a collection $C$ of sets, we let $C_\sigma$ denote the collection of all countable unions,
and $C_\delta$ the collection of all countable intersections, of sets in $C$. This is consistent
with our notations $F_\sigma$ and $G_\delta$, where $F$ is the collection of closed sets and $G$ is the
collection of open sets. Thus from $F_\sigma$ and $G_\delta$ we can go to


$F_{\sigma\delta} := (F_\sigma)_\delta$ and $G_{\delta\sigma} := (G_\delta)_\sigma$,


and so on through $F_{\sigma\delta\sigma}$, $G_{\delta\sigma\delta}$, $F_{\sigma\delta\sigma\delta}$, $G_{\delta\sigma\delta\sigma}$, etc. Clearly all these collections of sets
consist only of Borel sets and we have


$F \subseteq F_\sigma \subseteq F_{\sigma\delta} \subseteq F_{\sigma\delta\sigma} \subseteq \cdots \subseteq B$ and $G \subseteq G_\delta \subseteq G_{\delta\sigma} \subseteq G_{\delta\sigma\delta} \subseteq \cdots \subseteq B$.

The above collections (obtained by iterating the operations of countable union and
intersection through finitely many steps) still do not exhaust the Borel sets, and the
process can be continued into the transfinite through the ordinals. It can be shown
that such iterations keep generating newer and newer Borel sets through all the
countable ordinals until the process stops at iteration $\omega_1$, giving precisely the collection
of all Borel sets.

The iterative definition of Borel sets above makes them a class of effectively
defined sets (Sect. 5.5). Roughly speaking, among Borel sets, we can distinguish
between degrees of effectiveness (or complexity). The open and closed sets are the
simplest and most effective kinds of Borel sets. Sets which are $F_\sigma$ or $G_\delta$ (but neither
open nor closed) are at the next degree of effectiveness, and so on.


Proposition 1124. If $C$ is a collection of subsets of $\mathbf{R}$ containing the open sets and
closed under both countable unions and countable intersections, then $C$ contains
all Borel sets.


Proof. Since $C$ contains all open sets and is closed under countable intersections,
$C$ contains all $G_\delta$ sets and hence all closed sets.

Now let $A$ be the collection of all subsets $E$ of $\mathbf{R}$ such that both $E$ and $\mathbf{R} \setminus E$ are
in $C$. Since $C$ contains both all the open sets and all the closed sets, it follows that
$A$ contains all open sets. It is readily verified that $A$ is closed under complements
and under countable unions. Hence $A$ is a sigma-algebra containing the open sets
and so $B \subseteq A$. Therefore $B \subseteq C$. □


Corollary 1125. The collection of Borel sets is the smallest collection containing 
the open sets and closed under both countable unions and countable intersections. 


Problem 1126. An infinite sigma-algebra $S$ contains at least $\mathfrak{c}$ many sets.


[Hint: Fix distinct $A_1, A_2, \ldots \in S$. For $E \subseteq \mathbf{N}$, put $B_E := \bigcap_{n \in E} A_n \setminus \bigcup_{n \notin E} A_n$. Then
infinitely many of the (pairwise disjoint) sets $B_E \in S$ are non-empty.]


Problem 1127. For $f: \mathbf{R} \to \mathbf{R}$, $\{a \in \mathbf{R} \mid \lim_{x \to a} f(x) \text{ exists}\}$ is a Borel set.
[Hint: By Problem 1037 the set of points of continuity of $f$ is a $G_\delta$ set, and by
Problem 985 the set of points of removable discontinuity of $f$ is countable.]


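Problem 1128. Let $f: \mathbf{R} \to \mathbf{R}$ be continuous. Show that the set of points $H :=
\{a \in \mathbf{R} \mid f'(a) = 0\}$ at which the derivative of $f$ vanishes is Borel.

[Hint: $a \notin H \iff \exists p \in \mathbf{Q}^+ \, \forall \delta \in \mathbf{Q}^+ \, \exists r \in \mathbf{Q} \, \bigl(0 < |r - a| < \delta \text{ and } \bigl|\tfrac{f(r) - f(a)}{r - a}\bigr| > p\bigr)$;
and quantifiers ranging over countable sets can be “converted into” countable unions
and intersections.]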


Problem 1129. Let $f: \mathbf{R} \to \mathbf{R}$ be continuous. Then the set of points $D := \{x \in
\mathbf{R} \mid f'(x) \text{ exists}\}$ at which the derivative of $f$ exists is Borel.


18.2 Analytic Sets 


If $f: \mathbf{R} \to \mathbf{R}$ is a continuous function, then by the last problem (Problem 1129),
the domain $\mathrm{dom}(f')$ of its derivative $f'$ is a Borel set. However, the range $\mathrm{ran}(f')$
of $f'$ in general may not be a Borel set. The great mathematician Lebesgue made
a famous error thinking that such sets are Borel. Suslin, a young student of Lusin,
caught Lebesgue's error and introduced a larger class of naturally and effectively
defined sets which include sets such as $\mathrm{ran}(f')$ (where $f$ is continuous). This is
the class of analytic sets, and Suslin used a special operation, now called the Suslin
operation, to define such sets. In this section, we will define the Suslin operation
and analytic sets.


Review of Trees over N, Terminology and Notation 


$\mathbf{N}^*$ is the set of all strings (finite sequences) from the set $\mathbf{N} = \{1, 2, 3, \ldots\}$. Then
$\mathbf{N}^*$ is a tree under the relation “$u$ is a prefix of $v$,” in which every node branches
into infinitely many immediate extensions (Sect. 11.5). A portion of the tree $\mathbf{N}^*$ is
shown below. To simplify notation, we will often write a finite or infinite sequence
$\langle n_1, n_2, n_3, \ldots \rangle$ simply as the string $n_1 n_2 n_3 \cdots$. For example, the string “231”
denotes the finite sequence $\langle 2, 3, 1 \rangle$.


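[Figure: a portion of the tree $\mathbf{N}^*$, showing the root $\varepsilon$; its immediate extensions 1, 2, 3, ...; the nodes 11, 12, 13, ... and 21, 22, 23, ... below them; and the nodes 231, 232, 233, ... below 23.]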


Let us now recall and record the following basic definitions. 


If $u = u_1 u_2 \cdots u_m \in \mathbf{N}^*$ and $v = v_1 v_2 \cdots v_n \in \mathbf{N}^*$ are finite strings of natural
numbers, and $x = x_1 x_2 \cdots x_n \cdots \in \mathbf{N}^{\mathbf{N}}$ is an infinite string of natural numbers,
then:


1. The number $m$ is called the length of $u$, denoted by $\mathrm{len}(u)$.

2. The empty string is denoted by $\varepsilon$, so that $\mathrm{len}(\varepsilon) = 0$.

3. We say that $u$ is an initial segment or prefix of $v$, or that $v$ is an extension of $u$, if $m \le n$ and
$u_k = v_k$ for all $k \le m$. If, in addition, $n = m + 1$, i.e., $\mathrm{len}(v) = \mathrm{len}(u) + 1$, then
$v$ is an immediate extension of $u$.

4. Similarly, $u$ is an initial segment or prefix of $x$, or $x$ is an extension of $u$, if
$u_k = x_k$ for all $k \le m$.

5. $u * v := u_1 u_2 \cdots u_m v_1 v_2 \cdots v_n$ is the concatenation of $u$ and $v$.

6. For each $r \in \mathbf{N}$, we write $u^\frown r$ to denote the “immediate extension of $u$ obtained
by appending $r$,” that is, $u^\frown r := u_1 u_2 \cdots u_m r = u * \langle r \rangle$.

7. $x|m$ is the finite string $x_1 x_2 \cdots x_m$ obtained by truncating $x$ to its first $m$ values
(the unique finite sequence of length $m$ which is a prefix of $x$).


Definitions of trees, infinite branches, well-founded trees:


1. A subset $T \subseteq \mathbf{N}^*$ is a tree if any prefix of any string in $T$ is in $T$.

2. An infinite branch $B$ is an infinite tree $\subseteq \mathbf{N}^*$ which is linearly ordered:

$B = \{\varepsilon, u_1, u_1 u_2, u_1 u_2 u_3, \ldots\} \subseteq \mathbf{N}^*$.

As in Sect. 13.3, infinite branches are identified with elements of $\mathbf{N}^{\mathbf{N}}$, with $x \in
\mathbf{N}^{\mathbf{N}}$ determining the infinite branch $\{x|n \mid n = 0, 1, 2, \ldots\}$.

3. A tree $T \subseteq \mathbf{N}^*$ is ill-founded if $T$ contains an infinite branch. Otherwise $T$ is
well-founded.

4. If $T \subseteq \mathbf{N}^*$ and $u \in \mathbf{N}^*$, then $T^{(u)}$, the truncation of $T$ at $u$, is defined as $T^{(u)} :=
\{v \in \mathbf{N}^* \mid u * v \in T\}$. Note that if $T$ is a tree then so is $T^{(u)}$.


[Figure: $T_\omega$, a well-founded tree of rank $\omega$, and $T_{\omega+\omega}$, a well-founded tree of rank $\omega + \omega$.]


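Facts on well-founded trees (recall from Sect. 11.5):


1. A tree $T \subseteq \mathbf{N}^*$ is well-founded $\iff$ the string extension relation $\supset$ is well-founded
on $T$, in which case $\mathrm{rank}(T) = $ rank of $(T \setminus \{\varepsilon\}, \supset)$. If $T \neq \emptyset$, then $\mathrm{rank}(T) =
\rho_T(\varepsilon)$, where $\rho_T$ is the canonical rank function on $(T, \supset)$.

2. If $T$ is a well-founded tree then so is $T^{(u)}$, with $\mathrm{rank}(T^{(u)}) \le \mathrm{rank}(T)$. If $T^{(\langle n \rangle)}$
is well-founded for all $n \in \mathbf{N}$, then so is $T$.

3. The rank of a well-founded tree $T$ satisfies the following recursion:

$\mathrm{rank}(T) = \sup \{\mathrm{rank}(T^{(\langle n \rangle)}) + 1 \mid \langle n \rangle \in T, \ n \in \mathbf{N}\}$ \quad ($\sup \emptyset := 0$).

4. For $0 < \alpha < \omega_1$, $\mathrm{rank}(T) = \alpha \iff \mathrm{rank}(T^{(\langle n \rangle)}) < \alpha$ for all $n \in \mathbf{N}$ and for all
$\xi < \alpha$ there is $u \in \mathbf{N}^* \setminus \{\varepsilon\}$ with $\mathrm{rank}(T^{(u)}) = \xi$ (Problem 854).

5. Let $T_1$ and $T_2$ be trees and $f: T_1 \to T_2$ be strictly increasing, i.e., $u \subsetneq v \Rightarrow
f(u) \subsetneq f(v)$. If $T_2$ is well-founded then so is $T_1$ with $\mathrm{rank}(T_1) \le \mathrm{rank}(T_2)$
(Problem 819).

6. For each $\alpha < \omega_1$, there is a well-founded tree $T \subseteq \mathbf{N}^*$ with $\mathrm{rank}(T) = \alpha$
(Problem 857).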
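
For finite trees, the truncation operation and the rank recursion of Fact 3 can be carried out mechanically. The sketch below represents a tree as a Python set of tuples closed under prefixes; this representation and the function names are assumptions made for illustration, and the code handles only finite trees (so only finite ranks).

```python
def truncation(tree, u):
    """T^(u) = { v : u * v is in T }, for a tree given as a set of tuples."""
    return {v[len(u):] for v in tree if v[:len(u)] == u}


def rank(tree):
    """Rank of a FINITE tree via the recursion
    rank(T) = sup { rank(T^(<n>)) + 1 : <n> in T }, with sup of the empty set = 0."""
    children = {v[0] for v in tree if len(v) == 1}
    return max((rank(truncation(tree, (n,))) + 1 for n in children), default=0)


# Hypothetical example tree: root, nodes 1 and 3, and node 12 below 1.
example_tree = {(), (1,), (1, 2), (3,)}
example_rank = rank(example_tree)
```

The empty tree and the singleton tree containing only the root both get rank 0, in agreement with the convention that the supremum of the empty set is 0.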


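Problem 1130. Recall the Kleene–Brouwer ordering on $\mathbf{N}^*$: $u <_{KB} v \iff$ either
$u$ lexicographically precedes $v$ or $u$ is a proper extension of $v$. Put $r(\varepsilon) := 0$ and
$r(\langle n_1, n_2, \ldots, n_k \rangle) := \frac{1}{2^{n_1}} + \frac{1}{2^{n_1 + n_2}} + \cdots + \frac{1}{2^{n_1 + n_2 + \cdots + n_k}}$. Then (1) $r$ is a bijection
between $\mathbf{N}^*$ and the dyadic rationals in $[0, 1)$. (2) $u <_{KB} v \iff r(u) > r(v)$.
(3) $(\mathbf{N}^*, <_{KB})$ is a linear order of type $\eta + 1$. (4) If $T \subseteq \mathbf{N}^*$ is a tree, then $T$ is
well-founded $\iff$ $T$ is well-ordered under $<_{KB}$.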
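
Part (2) of the problem can be spot-checked on short strings. The sketch below assumes that r maps a string to the sum of the terms 1/2^(n_1 + ... + n_j) over its prefixes, and that "lexicographically precedes" refers to the first position where two strings differ; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def r(u):
    """Assumed map: r(<n_1,...,n_k>) = sum over j of 1 / 2^(n_1 + ... + n_j)."""
    total, exponent = Fraction(0), 0
    for n in u:
        exponent += n
        total += Fraction(1, 2 ** exponent)
    return total


def kb_less(u, v):
    """u <_KB v: u has the smaller entry at the first difference,
    or u is a proper extension of v."""
    for a, b in zip(u, v):
        if a != b:
            return a < b
    return len(u) > len(v)


# All strings over {1, 2, 3} of length at most 3.
strings = [s for k in range(4) for s in product((1, 2, 3), repeat=k)]
order_agrees = all(kb_less(u, v) == (r(u) > r(v)) for u in strings for v in strings)
```

Exact rational arithmetic via `Fraction` avoids rounding issues when comparing values of r.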


The Suslin Operation 


Recall how a Cantor set is generated from a family of closed intervals (a Cantor 
system) indexed by the full binary tree: Each infinite branch through the binary tree 
determines a nested sequence of closed intervals whose intersection—the “branch 
intersection”—is a singleton, and then the Cantor set is obtained by taking the union 
of all such branch intersections. 

We now generalize the formation of the Cantor set to the case where arbitrary 
sets are used in place of the special closed intervals of a Cantor system, and where 
these sets are indexed by the infinitely branching tree N* (instead of the finitely 
branching binary tree). 

Let $(E_u \mid u \in \mathbf{N}^*)$ be a family of sets indexed by $\mathbf{N}^*$. Then given any infinite
branch $x = \langle x_n \rangle_{n \in \mathbf{N}} \in \mathbf{N}^{\mathbf{N}}$, we can form the “branch intersection”


$\bigcap_{n=1}^{\infty} E_{x|n} = E_{x_1} \cap E_{x_1 x_2} \cap E_{x_1 x_2 x_3} \cap \cdots \cap E_{x_1 x_2 \cdots x_n} \cap \cdots$


The union of all such branch intersections (as $x$ ranges over all possible infinite
branches through $\mathbf{N}^*$) will be called the result of the Suslin Operation applied to the
family $(E_u \mid u \in \mathbf{N}^*)$. More precisely, we have:


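Definition 1131 (The Suslin Operation). If $(E_u \mid u \in \mathbf{N}^*)$ is a family of sets
indexed by nodes in the tree $\mathbf{N}^*$, then the result of the Suslin operation applied
to $(E_u \mid u \in \mathbf{N}^*)$, denoted by


$\mathcal{A}(\langle E_u \mid u \in \mathbf{N}^* \rangle) = \mathcal{A}_u E_u$,

is defined to be the set


$\mathcal{A}_u E_u := \bigcup_{y \in \mathbf{N}^{\mathbf{N}}} \bigcap_{n=1}^{\infty} E_{y|n} = \{x \in \mathbf{R} \mid (\exists y \in \mathbf{N}^{\mathbf{N}})(\forall n \in \mathbf{N})(x \in E_{y|n})\}$.


Thus $x \in \mathcal{A}_u E_u$ if and only if there is a branch $y_1 y_2 \cdots y_n \cdots \in \mathbf{N}^{\mathbf{N}}$ such that


$x \in \bigcap_{n=1}^{\infty} E_{y_1 y_2 \cdots y_n}$.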


The figure below shows a family $(E_u \mid u \in \mathbf{N}^*)$ of sets indexed by $\mathbf{N}^*$ and the sets
corresponding to the branch $3221\cdots = \{\varepsilon, 3, 32, 322, 3221, \ldots\}$:


Note that the result $\mathcal{A}_u E_u$ of the Suslin operation performed on the family
$(E_u \mid u \in \mathbf{N}^*)$ does not depend on the set $E_\varepsilon$. The following problems show that
the Suslin operation is “more powerful” than the operations of countable union and
countable intersection of sets.


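Problem 1132. Show that if $C = \bigcup_{n=1}^{\infty} C_n$, then $C = \mathcal{A}_u E_u$ for some family
$(E_u \mid u \in \mathbf{N}^*)$ where each $E_u = C_n$ for some $n$.


[Hint: Put $E_{u_1 u_2 \cdots u_n} = C_{u_1}$.]

Problem 1133. Show that if $C = \bigcap_{n=1}^{\infty} C_n$, then $C = \mathcal{A}_u E_u$ for some family
$(E_u \mid u \in \mathbf{N}^*)$ where each $E_u = C_n$ for some $n$.


[Hint: Put $E_u = C_{\mathrm{len}(u)}$.]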
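
The two hints can be illustrated on a finite toy version of the Suslin operation. The sketch below is only an approximation under strong assumptions: it truncates branches to a fixed finite depth and a finite alphabet, whereas the actual operation ranges over all infinite branches; the function name `suslin_truncated` and the sample sets are hypothetical.

```python
from itertools import product


def suslin_truncated(E, alphabet, depth):
    """Finite stand-in for the Suslin operation: the union, over all strings x
    of length `depth` (depth >= 1) over `alphabet`, of E(x|1) & ... & E(x|depth)."""
    result = set()
    for x in product(alphabet, repeat=depth):
        branch = set(E(x[:1]))
        for n in range(2, depth + 1):
            branch &= E(x[:n])
        result |= branch
    return result


# Hypothetical finite sets C_1, C_2, C_3.
C = {1: {1, 2, 5}, 2: {2, 3, 5}, 3: {3, 4, 5}}

# Hint of Problem 1132: E_u depends only on the first entry of u.
union_result = suslin_truncated(lambda u: C[u[0]], [1, 2, 3], 3)

# Hint of Problem 1133: E_u depends only on the length of u.
intersection_result = suslin_truncated(lambda u: C[len(u)], [1, 2, 3], 3)
```

With the first choice every branch intersection collapses to a single C_n, so the union of branches is the union of the C_n; with the second, every branch yields the same intersection.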
Definition 1134 (Suslin Systems). A Suslin system is a family of closed intervals
$(F_u \mid u \in \mathbf{N}^*)$ indexed by $\mathbf{N}^*$, satisfying the following conditions:


1. For each $u \in \mathbf{N}^*$, $F_u$ is a closed interval in $\mathbf{R}$, possibly unbounded or empty.
2. $F_u \supseteq F_v$ whenever $u$ is a prefix of $v$.
3. For any infinite sequence of natural numbers $x \in \mathbf{N}^{\mathbf{N}}$, $\mathrm{len}(F_{x|n}) \to 0$ as $n \to \infty$.


Note that we regard $\mathbf{R}$ and $\emptyset$ as closed intervals, with $\mathrm{len}(\emptyset) = 0$.


An analytic set is now defined to be one which can be obtained as the result of the 
Suslin operation applied to a Suslin system. 


Definition 1135 (Analytic Sets). A subset $A$ of $\mathbf{R}$ is an analytic set if
$A = \mathcal{A}_u F_u$


for some Suslin system $(F_u \mid u \in \mathbf{N}^*)$.


Clearly, the notion of a Suslin system is a generalization of that of a Cantor system,
but it differs from a Cantor system in two important ways: (a) The sets $F_u$ ($u \in
\mathbf{N}^*$) in a Suslin system are indexed by the infinitely branching tree $\mathbf{N}^*$ (instead of
the finitely branching binary tree $\{0, 1\}^*$), and (b) the sets $F_u$ are arbitrary closed
intervals (possibly unbounded or empty), and $F_{u^\frown m}$ and $F_{u^\frown n}$, $m \neq n$, are no longer
required to be disjoint. (Problem 1145 below shows that both these differences are
necessary if the collection of analytic sets is going to be sufficiently comprehensive.)


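Problem 1136. Let $(F_u \mid u \in \mathbf{N}^*)$ be a Suslin system. Show that


1. The set $\{u \in \mathbf{N}^* \mid F_u \neq \emptyset\}$ is a tree over $\mathbf{N}$ (a subtree of $\mathbf{N}^*$).
2. For each $x \in \mathbf{R}$, the set $\{u \in \mathbf{N}^* \mid x \in F_u\}$ is also a tree over $\mathbf{N}$.
3. $x \in \mathcal{A}_u F_u \iff$ the tree $\{u \in \mathbf{N}^* \mid x \in F_u\}$ is not well-founded.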


Problem 1137. Every closed interval, possibly unbounded, is an analytic set. 
Problem 1138. Any countable set is analytic. 
Problem 1139. All closed sets and all open sets are analytic. 


Theorem 1140. The collection of analytic sets is closed under the Suslin operation:
If $E_u$ is analytic for each $u \in \mathbf{N}^*$, then $\mathcal{A}_u E_u$ is analytic.


Proof. Fix a bijection $\pi: \mathbf{N}^2 \to \mathbf{N}$ satisfying
$m < m' \Rightarrow \pi(m, n) < \pi(m', n)$ and $n < n' \Rightarrow \pi(m, n) < \pi(m, n')$.


Fix a function $g: \mathbf{N} \to \mathbf{N}$ such that for all $n$, $g(n) \le n$ and there exist infinitely many
$m$ with $g(m) = n$. (The sequence $1, 1, 2, 1, 2, 3, 1, 2, 3, 4, \ldots$ is such a function.)
Let $h(n) := |\{k \mid k \le n \text{ and } g(k) = g(n)\}|$.
Define $k \ll n \iff k \le n$, $\pi(1, g(k)) \le n$, and $\pi(g(k) + 1, h(k)) \le n$.

Given $w = w_1 w_2 \cdots w_n \in \mathbf{N}^*$, define


$\alpha(w, k) := w_{\pi(1,1)} w_{\pi(1,2)} \cdots w_{\pi(1, g(k))}$ if $k \ll n$, and $\alpha(w, k) := \varepsilon$ otherwise,


and


$\beta(w, k) := w_{\pi(g(k)+1, 1)} w_{\pi(g(k)+1, 2)} \cdots w_{\pi(g(k)+1, h(k))}$ if $k \ll n$, and $\beta(w, k) := \varepsilon$ otherwise.

Now let $E_u$ be analytic for each $u \in \mathbf{N}^*$. We need to show that the set $\mathcal{A}_u E_u$ is
analytic. For each $u \in \mathbf{N}^*$ there is a Suslin system $(F^u_v \mid v \in \mathbf{N}^*)$ such that


$E_u = \mathcal{A}_v F^u_v$.


Define a Suslin system $(D_w \mid w \in \mathbf{N}^*)$ as:


$D_w := \bigcap_{k \ll \mathrm{len}(w)} F^{\alpha(w,k)}_{\beta(w,k)}$.


It is routine to verify that this is indeed a Suslin system (details left for the reader).
We claim that


$\mathcal{A}_u E_u = \mathcal{A}_w D_w$.
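Suppose first that $a \in \mathcal{A}_u E_u$. Then there is $x = x_1 x_2 \cdots x_n \cdots \in \mathbf{N}^{\mathbf{N}}$ such that
$a \in E_{x_1 x_2 \cdots x_n}$ for all $n$. Since $E_{x|n} = \mathcal{A}_v F^{x|n}_v$, for each $n$ there is an infinite
sequence $y^{(n)} = y_{n,1} y_{n,2} \cdots y_{n,k} \cdots$ of natural numbers such that


$a \in \bigcap_{k=1}^{\infty} F^{x|n}_{y^{(n)}|k} = \bigcap_{k=1}^{\infty} F^{x_1 x_2 \cdots x_n}_{y_{n,1} y_{n,2} \cdots y_{n,k}}$.

Now consider the infinite matrix

$\begin{matrix} x_1 & x_2 & \cdots & x_n & \cdots \\ y_{1,1} & y_{1,2} & \cdots & y_{1,n} & \cdots \\ y_{2,1} & y_{2,2} & \cdots & y_{2,n} & \cdots \\ \vdots & \vdots & & \vdots & \end{matrix}$

and combine it into a single infinite sequence $z_1 z_2 \cdots z_n \cdots$ using the pairing
function $\pi$ as follows:


$z_{\pi(1,n)} := x_n$ and $z_{\pi(m+1,n)} := y_{m,n}$.


Then it is readily verified that $a \in D_{z|n}$ for any $n$, and so $a \in \mathcal{A}_w D_w$.

Conversely, suppose that $a \in \mathcal{A}_w D_w$. Then there is $z = z_1 z_2 \cdots z_n \cdots \in \mathbf{N}^{\mathbf{N}}$ such
that $a \in D_{z|n}$ for all $n$. Define infinite sequences $x = x_1 x_2 \cdots x_n \cdots$ and $y^{(n)} =
y_{n,1} y_{n,2} \cdots$ (for each $n$) by setting


$x_n := z_{\pi(1,n)}$, $\quad y_{n,m} := z_{\pi(n+1,m)}$.


We claim that $a \in E_{x|n}$ for all $n$. Fix any $n$. Since $E_{x|n} = \mathcal{A}_v F^{x|n}_v$, it suffices to
show that for all $m$,


$a \in F^{x|n}_{y^{(n)}|m}$.

Enumerate $\{k \mid g(k) = n\}$ as $\{k_1 < k_2 < \cdots\}$, so that $g(k_m) = n$ and $h(k_1) = 1$,
$h(k_2) = 2$, etc. For each $m$, fix any $n_m$ such that $k_m \ll n_m$. Hence $\alpha(z|n_m, k_m) =
x_1 x_2 \cdots x_n = x|n$ for all $m$, and $\beta(z|n_m, k_m) = y_{n,1} y_{n,2} \cdots y_{n,m}$. Since $a \in D_{z|n_m}$
and $k_m \ll n_m$, it follows that


$a \in F^{\alpha(z|n_m, k_m)}_{\beta(z|n_m, k_m)} = F^{x|n}_{y^{(n)}|m}$.


Thus $a \in \mathcal{A}_u E_u$. □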
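
The bookkeeping functions used in this proof can be made concrete. The sketch below assumes one particular choice of pairing function (a Cantor-style diagonal enumeration of pairs of positive integers, which is monotone in each argument); the proof itself only needs some bijection with the stated monotonicity, so this choice is an assumption, as are the Python names.

```python
def pi(m, n):
    """Assumed pairing N^2 -> N on positive integers (diagonal enumeration);
    strictly increasing in each argument separately."""
    s = m + n
    return (s - 2) * (s - 1) // 2 + m


def g(n):
    """The sequence 1, 1,2, 1,2,3, 1,2,3,4, ...: every value recurs infinitely often."""
    block = 1
    while n > block:
        n -= block
        block += 1
    return n


def h(n):
    """Number of k <= n with g(k) = g(n)."""
    return sum(1 for k in range(1, n + 1) if g(k) == g(n))


first_g = [g(n) for n in range(1, 11)]
first_h = [h(n) for n in range(1, 7)]
```

With this pairing, the pairs (m, n) with m + n at most 6 are mapped bijectively onto 1, ..., 15, and g(n) never exceeds n.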


Corollary 1141. The collection of analytic sets is closed under countable unions 
and countable intersections. 


A set is called coanalytic if its complement is analytic. 
Corollary 1142. Every Borel set is both analytic and coanalytic. 


Proof. Since every closed interval is analytic and every open interval is a countable 
union of closed intervals, therefore every open interval is analytic. Since every open 
set is a countable union of open intervals, therefore every open set is analytic. Thus 
the collection of all analytic sets contains all open sets and is closed under countable 
unions and countable intersections. Hence by Proposition 1124 the collection of 
analytic sets contains all Borel sets. 

Now the complement of any Borel set is still Borel, and hence analytic. Therefore 
every Borel set is coanalytic as well. □


In the next section, we will prove the converse of the above result. 


Problem 1143. There are exactly $\mathfrak{c} = 2^{\aleph_0}$ analytic sets. Hence each of the
following families of sets has cardinality $\mathfrak{c} = 2^{\aleph_0}$: the coanalytic sets, the Borel
sets, the $F_\sigma$ sets, the $G_\delta$ sets, the closed sets, and the open sets.


Problem 1144. There is a set which is both meager and of measure zero but neither
analytic nor coanalytic. Conclude that there are measurable sets and sets with the Baire
property which are not Borel, i.e., $B \subsetneq L$ and $B \subsetneq Y$.


Problem 1145. Let us say that a Suslin system $(F_u \mid u \in \mathbf{N}^*)$ is finitely branching
if the tree $\{u \in \mathbf{N}^* \mid F_u \neq \emptyset\}$ is finitely branching, and that it is disjointed if
$F_{u^\frown m} \cap F_{u^\frown n} = \emptyset$ whenever $m \neq n$. Show that if $(F_u \mid u \in \mathbf{N}^*)$ is either a finitely
branching or a disjointed Suslin system, then


$\mathcal{A}_u F_u = \bigcap_{n=1}^{\infty} \bigcup_{\mathrm{len}(u) = n} F_u$,


and hence the set $\mathcal{A}_u F_u$ must be an $F_{\sigma\delta}$ set.
[Hint: For the finitely branching case, use König's Infinity Lemma.]


Problem 1146. Show that if $A$ is analytic and $f: \mathbf{R} \to \mathbf{R}$ is continuous, then
$f[A]$ and $f^{-1}[A]$ are analytic sets.


[Hint: Problem 983 may help.] 

Remark. There are many characterizations of analytic sets that we will not cover.
For example, it can be shown that $A$ is analytic if and only if $A = f[B]$ for some
continuous $f$ and Borel $B$ (in fact $B$ can be taken to be a $G_\delta$). Earlier we mentioned
that $\mathrm{ran}(f')$ may not be Borel for a continuous $f$. A result of Poprougenko says
that $A$ is analytic $\iff A = \mathrm{ran}(f')$ for some continuous $f$. See [38, 45, 46, 55, 64]
for many other characterizations.


18.3 The Lusin Separation Theorem


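Definition 1147. To each Suslin system $(F_u \mid u \in \mathbf{N}^*)$, we associate a family
$(F^{(*)}_u \mid u \in \mathbf{N}^*)$ of sets indexed by $\mathbf{N}^*$ by setting, for each $v \in \mathbf{N}^*$:


$F^{(*)}_v := \{x \in \mathbf{R} \mid \text{there is } y \in \mathbf{N}^{\mathbf{N}} \text{ extending } v \text{ such that } x \in \bigcap_n F_{y|n}\}$.


Thus $x \in F^{(*)}_{u_1 u_2 \cdots u_m}$ if and only if there is $y_1 y_2 \cdots y_n \cdots \in \mathbf{N}^{\mathbf{N}}$ such that


$u_k = y_k$ for all $k \le m$ and $x \in \bigcap_{n=0}^{\infty} F_{y_1 y_2 \cdots y_n}$.


Note that we have $F^{(*)}_\varepsilon = \mathcal{A}_u F_u$, and writing $v * u$ for the concatenation of $v$ and $u$,
we can express $F^{(*)}_v$ as $F^{(*)}_v = \mathcal{A}_u F_{v * u}$.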

Problem 1148. If $(F_u \mid u \in \mathbf{N}^*)$ is a Suslin system and $v \in \mathbf{N}^*$, then:

1. $F^{(*)}_v$ is analytic.

2. $F^{(*)}_v = \bigcup_{n=1}^{\infty} F^{(*)}_{v^\frown n}$.

Definition 1149. We say that $C$ separates $A$ from $B$ if $A \subseteq C$ and $C \cap B = \emptyset$.
The sets $A$ and $B$ are called Borel separable if there exists a Borel set $C$ separating
$A$ from $B$.


Proposition 1150. If $A = \bigcup_m A_m$, $B = \bigcup_n B_n$, and $A_m$ and $B_n$ are Borel separable
for all $m, n \in \mathbf{N}$, then $A$ and $B$ are Borel separable.


Proof. For each $m, n$ fix a Borel set $C_{m,n}$ separating $A_m$ from $B_n$. Put


$D_m := \bigcap_n C_{m,n}$ (for each $m$), and $E := \bigcup_m D_m$.


Then the set $E$ is Borel. Now note that for each $m$ we have $A_m \subseteq D_m$ and $D_m \cap
B = \emptyset$. Hence $A \subseteq E$ and $E \cap B = \emptyset$, so $E$ separates $A$ from $B$. □
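
The construction in this proof is purely combinatorial and can be traced on finite sets. The following sketch uses finite Python sets as stand-ins for the Borel sets involved; the sample data and the name `combine_separators` are hypothetical and serve only to illustrate the formula E = union over m of the intersection over n of C_{m,n}.

```python
def combine_separators(C):
    """Given C[m][n] separating A_m from B_n, return
    E = union over m of (intersection over n of C[m][n])."""
    E = set()
    for row in C:
        D_m = set.intersection(*(set(c) for c in row))
        E |= D_m
    return E


# Hypothetical pieces of A and B inside the universe {1, ..., 8}.
A_parts = [{1, 2}, {3}]
B_parts = [{4}, {5, 6}]
universe = set(range(1, 9))

# One valid (if crude) choice of separators: C[m][n] is the complement of B_n.
C = [[universe - B_n for B_n in B_parts] for _ in A_parts]
E = combine_separators(C)
A = set().union(*A_parts)
B = set().union(*B_parts)
```

Each D_m contains A_m and misses every B_n, so the union E contains A and is disjoint from B.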


The following result is called the Lusin Separation Theorem. 


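Theorem 1151 (Lusin). If $A$ and $B$ are analytic sets which are not Borel separable,
then $A \cap B \neq \emptyset$. Hence disjoint analytic sets are Borel separable.


Proof. Suppose that $A = \mathcal{A}_u E_u$ and $B = \mathcal{A}_v F_v$ are analytic sets which are not
Borel separable, where $(E_u \mid u \in \mathbf{N}^*)$ and $(F_v \mid v \in \mathbf{N}^*)$ are Suslin systems. Since


$A = E^{(*)}_\varepsilon = \bigcup_m E^{(*)}_{\langle m \rangle}$ and $B = F^{(*)}_\varepsilon = \bigcup_n F^{(*)}_{\langle n \rangle}$,


by the last proposition there must exist $m_1$ and $n_1$ such that $E^{(*)}_{\langle m_1 \rangle}$ and $F^{(*)}_{\langle n_1 \rangle}$ are not
Borel separable. Again, since


$E^{(*)}_{\langle m_1 \rangle} = \bigcup_m E^{(*)}_{\langle m_1, m \rangle}$ and $F^{(*)}_{\langle n_1 \rangle} = \bigcup_n F^{(*)}_{\langle n_1, n \rangle}$,


there are $m_2$ and $n_2$ such that $E^{(*)}_{\langle m_1, m_2 \rangle}$ and $F^{(*)}_{\langle n_1, n_2 \rangle}$ are not Borel separable.
Continuing the process, we get two infinite sequences $\langle m_1, m_2, \ldots, m_k, \ldots \rangle$ and
$\langle n_1, n_2, \ldots, n_k, \ldots \rangle$ such that for every $k$,

$E^{(*)}_{\langle m_1, m_2, \ldots, m_k \rangle}$ and $F^{(*)}_{\langle n_1, n_2, \ldots, n_k \rangle}$ are not Borel separable.


But if the above two sets are not Borel separable, then for every $k$, the closed
intervals $I_k := E_{\langle m_1, m_2, \ldots, m_k \rangle}$ and $J_k := F_{\langle n_1, n_2, \ldots, n_k \rangle}$ cannot be disjoint, since
if we had $I_k \cap J_k = \emptyset$ then the Borel set $I_k$ would separate $E^{(*)}_{\langle m_1, m_2, \ldots, m_k \rangle}$
from $F^{(*)}_{\langle n_1, n_2, \ldots, n_k \rangle}$. Thus $I_k$ and $J_k$ are nested sequences of closed intervals with
$I_k \cap J_k \neq \emptyset$ and $\mathrm{len}(I_k) \to 0$ and $\mathrm{len}(J_k) \to 0$ as $k \to \infty$, and so


$\bigcap_k I_k = \{p\}$ and $\bigcap_k J_k = \{q\}$

with $p \in A$ and $q \in B$. Now if we had $p \neq q$, we could choose $k$ sufficiently large
so that $\mathrm{len}(I_k), \mathrm{len}(J_k) < |p - q|/2$, implying $I_k \cap J_k = \emptyset$, which is a contradiction.
Hence $p = q$ and so $A \cap B \neq \emptyset$. □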


Corollary 1152 (Suslin’s Theorem). A set is Borel if and only if it is both analytic 
and coanalytic. 


In Sect. 18.6 we will explicitly define a non-Borel analytic set. 


18.4 Measurability and Baire Property of Analytic Sets 


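Definition 1153. An Ulam matrix is a family of sets $(E^n_\alpha \mid n \in \mathbf{N}, \alpha < \omega_1)$ with $\aleph_0$
rows and $\aleph_1$ columns as shown below such that:


1. The sets in each row are pairwise disjoint: $E^n_\alpha \cap E^n_\beta = \emptyset$ for $\alpha \neq \beta$.
2. The union of the sets in each column contains any set to its right in the first row:
$\bigcup_{n \in \mathbf{N}} E^n_\alpha \supseteq E^1_\beta$ for any $\beta > \alpha$.


[Figure: An Ulam matrix $(E^n_\alpha \mid n \in \mathbf{N}, \alpha < \omega_1)$, with rows indexed by $n \in \mathbf{N}$ and columns indexed by $\alpha < \omega_1$.]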

Note. The ordering of the rows really does not matter, so the rows could be indexed 
by any countable set instead of N so long as one fixed row is designated as “the 
first row.” 

To prove that analytic sets are Lebesgue measurable and have the Baire property, 
we will use the following result, which is also of considerable interest in itself as it 
provides coanalytic sets with “nice ordinal ranks.” 


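Theorem 1154 (Ulam Matrix Decomposition of Coanalytic Sets). For any coanalytic
set $C$ there is an Ulam matrix of Borel sets whose first row has union $C$.

More specifically, there is a family of Borel sets $(C^u_\alpha \mid u \in \mathbf{N}^*, \alpha < \omega_1)$ satisfying
the following, where we use the abbreviation $C_\alpha := C^\varepsilon_\alpha$:


1. For each $u \in \mathbf{N}^*$ and all ordinals $\alpha, \beta < \omega_1$, $C^u_\alpha \cap C^u_\beta = \emptyset$ if $\alpha \neq \beta$.
2. If $\beta > \alpha$ and $x \in C_\beta$ then $x \in C^u_\alpha$ for some $u \in \mathbf{N}^*$.
3. $C = \bigcup_{\alpha < \omega_1} C_\alpha$, where by (1) the sets $C_\alpha$, $\alpha < \omega_1$, are pairwise disjoint.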


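Proof. Let $(F_u \mid u \in \mathbf{N}^*)$ be a Suslin system with $\mathbf{R} \setminus C = \mathcal{A}_u F_u$. For each $x \in \mathbf{R}$
and $u \in \mathbf{N}^*$, let

$T_x := \{v \in \mathbf{N}^* \mid x \in F_v\}$, and $T^{(u)}_x := \{v \in \mathbf{N}^* \mid x \in F_{u * v}\}$.


From the definition of a Suslin system, $T_x$ and $T^{(u)}_x$ are trees over $\mathbf{N}$ with $T_x = T^{(\varepsilon)}_x$.
Also, we have $x \in \mathbf{R} \setminus C \iff x \in \mathcal{A}_u F_u \iff$ there is an infinite branch through $T_x \iff$
$T_x$ is not well-founded. Thus we have:


For all $x \in \mathbf{R}$, $x \in C \iff T_x$ is well-founded.

By Problem 854 (Fact 2 on page 325), it follows that:

If $x \in C$, then $T^{(u)}_x$ is well-founded and $\mathrm{rank}(T^{(u)}_x) \le \mathrm{rank}(T_x)$.

Now define the set $C^u_\alpha$, for each $u \in \mathbf{N}^*$ and ordinal $\alpha < \omega_1$, by the condition:

$x \in C^u_\alpha \iff T^{(u)}_x$ is well-founded of rank $\alpha$ \quad ($x \in \mathbf{R}$),


and put, as in the statement of the theorem, $C_\alpha := C^\varepsilon_\alpha$.

Condition (1) of the theorem is now immediate.

Condition (2) follows from Problem 854 (Fact 4, page 325).

Now, note that $x \in C_\alpha \iff T_x$ is well-founded of rank $\alpha$, so $x \in \bigcup_{\alpha < \omega_1} C_\alpha \iff T_x$
is well-founded, hence condition (3) of the theorem follows.

It remains to show that $C^u_\alpha$ is Borel. We prove this by transfinite induction on $\alpha$
for the statement “$C^u_\alpha$ is Borel for all $u \in \mathbf{N}^*$.” Recall that the only trees of rank 0
are the empty tree and the singleton tree $\{\varepsilon\}$ consisting of the root node $\varepsilon$ alone, i.e.,
$T$ has rank 0 $\iff T \subseteq \{\varepsilon\}$.

For $\alpha = 0$, note that $x \in C^u_0 \iff T^{(u)}_x$ has rank 0 $\iff T^{(u)}_x \subseteq \{\varepsilon\} \iff x \notin F_u$ or
$x \in F_u \setminus \bigcup_n F_{u^\frown n}$, so $C^u_0 = (\mathbf{R} \setminus F_u) \cup (F_u \setminus \bigcup_n F_{u^\frown n})$, which is Borel.

For $\alpha > 0$, assume that for every $\xi < \alpha$, $C^u_\xi$ is Borel for all $u \in \mathbf{N}^*$ (induction
hypothesis). Then, by Problem 854 again (Fact 4, page 325):


$x \in C^u_\alpha \iff T^{(u)}_x$ has rank $\alpha$
$\iff \forall n \in \mathbf{N}$, $\mathrm{rank}(T^{(u^\frown n)}_x) < \alpha$, and $\forall \xi < \alpha \ \exists v \in \mathbf{N}^*$ $\mathrm{rank}(T^{(u * v)}_x) = \xi$
$\iff \forall n \in \mathbf{N} \ \exists \xi < \alpha$ $\mathrm{rank}(T^{(u^\frown n)}_x) = \xi$, and
$\forall \xi < \alpha \ \exists v \in \mathbf{N}^*$ $\mathrm{rank}(T^{(u * v)}_x) = \xi$.


Writing the above in terms of set unions and intersections:


$C^u_\alpha = \Bigl[\bigcap_{n \in \mathbf{N}} \bigcup_{\xi < \alpha} C^{u^\frown n}_\xi\Bigr] \cap \Bigl[\bigcap_{\xi < \alpha} \bigcup_{v \in \mathbf{N}^*} C^{u * v}_\xi\Bigr]$,


which by induction hypothesis is Borel since all the unions and intersections
involved are countable unions and intersections. □

Corollary 1155. Every coanalytic set is the union of $\aleph_1$-many Borel sets.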


Definition 1156. A sigma algebra $S$ containing a $\sigma$-ideal $Z$ is said to be CCC
modulo $Z$ if for any family $(A_i \mid i \in I)$ of pairwise disjoint sets in $S$, we have $A_i \in Z$
for all but countably many $i \in I$.


Natural examples of sigma algebras which are CCC (modulo a $\sigma$-ideal) are:


1. The sigma algebra $L$ of Lebesgue measurable sets is CCC modulo the $\sigma$-ideal of
measure zero sets (Corollary 1029.6).

2. The sigma algebra $Y$ of sets with Baire property is CCC modulo the $\sigma$-ideal of
meager sets (Corollary 1062).


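Theorem 1157. Let $S$ be a sigma algebra containing all Borel sets and $Z$ be a $\sigma$-ideal
contained in $S$ such that $S$ is CCC modulo $Z$. Then every coanalytic set (and
so every analytic set) is in $S$.


Proof. Let $C$ be a coanalytic set. By Theorem 1154, fix an “Ulam matrix” of Borel
sets $(C^u_\alpha \mid u \in \mathbf{N}^*, \alpha < \omega_1)$ such that, using the abbreviation $C_\alpha := C^\varepsilon_\alpha$,


1. For each $u \in \mathbf{N}^*$ and all ordinals $\alpha, \beta < \omega_1$, $C^u_\alpha \cap C^u_\beta = \emptyset$ if $\alpha \neq \beta$.
2. If $\beta > \alpha$ and $x \in C_\beta$ then $x \in C^u_\alpha$ for some $u \in \mathbf{N}^*$.
3. $C = \bigcup_{\alpha < \omega_1} C_\alpha$.


Since each $C^u_\alpha$ is Borel, $C^u_\alpha \in S$. For each $u \in \mathbf{N}^*$, the family $(C^u_\alpha \mid \alpha < \omega_1)$ is
pairwise disjoint by condition (1), and since $S$ is CCC modulo $Z$, there exists
$\alpha_u < \omega_1$ such that $C^u_\beta \in Z$ for all $\beta \ge \alpha_u$. Since there are only countably many
$u \in \mathbf{N}^*$, we can fix some $\alpha < \omega_1$ with $\alpha \ge \alpha_u$ for all $u \in \mathbf{N}^*$. Then $C^u_\alpha \in Z$ for
all $u \in \mathbf{N}^*$, and so the countable union $\bigcup_{u \in \mathbf{N}^*} C^u_\alpha$ is in $Z$. By (2), $\bigcup_{u \in \mathbf{N}^*} C^u_\alpha \supseteq
\bigcup_{\beta > \alpha} C_\beta$, so $\bigcup_{\beta > \alpha} C_\beta$ is in $Z \subseteq S$. Also, $\bigcup_{\beta \le \alpha} C_\beta$ is Borel and so is in $S$. Hence


$C = \bigl(\bigcup_{\beta \le \alpha} C_\beta\bigr) \cup \bigl(\bigcup_{\beta > \alpha} C_\beta\bigr)$ is in $S$. □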


Corollary 1158. All analytic sets and all coanalytic sets are Lebesgue measurable 
and have Baire property. 


Corollary 1159. No Vitali or Bernstein set is analytic or coanalytic. 


Remark. Ulam matrices are useful in showing that certain uncountable unions of
measure zero sets can still have measure zero. Ulam used them to show that no
nontrivial measure can be defined on a set whose cardinality is a successor cardinal
(like $\aleph_1$, $\aleph_2$, etc.). Ulam's proof is given in Theorem 1196, and it may be instructive
to compare it with the above proof.


18.5 The Perfect Set Property for Analytic Sets 


Theorem 1160. Every uncountable analytic set contains a perfect set and hence
has cardinality $\mathfrak{c}$.

Proof. The proof of the theorem will be a variant of the proof of the corresponding
theorem for dense-in-itself $G_\delta$ sets, but note that we cannot directly copy that proof
since a dense-in-itself analytic set may be countable. Let


$A = \mathcal{A}_u F_u$


be an uncountable analytic set, where $(F_u \mid u \in \mathbf{N}^*)$ is a Suslin system. The heart of
the proof of the theorem is in the following lemma.


Lemma 1161. For each $u \in \mathbf{N}^*$ and $\delta > 0$, if $F^{(*)}_u$ is uncountable then there exist $v$
and $w$ in $\mathbf{N}^*$ extending $u$ such that $F_v$ and $F_w$ are disjoint nonempty closed intervals
of length $< \delta$ with both $F^{(*)}_v$ and $F^{(*)}_w$ uncountable.


Proof (of Lemma). Since $F^{(*)}_u$ is uncountable, and by Theorem 970 all but countably
many points of $F^{(*)}_u$ are condensation points, we can pick two distinct condensation points
$p < q$ in $F^{(*)}_u$. Let $r = \min\bigl(\tfrac{1}{4}(q - p), \delta\bigr)$, and put


$L := F^{(*)}_u \cap (p - r, p + r)$ and $U := F^{(*)}_u \cap (q - r, q + r)$

so that $L$ and $U$ are disjoint uncountable subsets of $F^{(*)}_u$. Finally, writing $F(v)$ for $F_v$, put


$S := \{v \mid v \text{ extends } u, \ \mathrm{len}(F(v)) < r, \ F(v) \cap L \neq \emptyset\}$, and
$T := \{w \mid w \text{ extends } u, \ \mathrm{len}(F(w)) < r, \ F(w) \cap U \neq \emptyset\}$.


We claim that


$L \subseteq \bigcup_{v \in S} F^{(*)}_v$.


To see this, let $x \in L$ and pick $z \in \mathbf{N}^{\mathbf{N}}$ extending $u$ such that $x \in \bigcap_m F(z|m)$. Fix a
large enough $m$ to make $\mathrm{len}(F(z|m)) < r$ and $m \ge \mathrm{len}(u)$, and put $v = z|m$. Then
$x \in F(v) \cap L$, $\mathrm{len}(F(v)) < r$, and $v$ extends $u$, so $v \in S$, and hence $x \in F^{(*)}_v$. Thus
the claim is established.

Since $L$ is uncountable and $S$ is countable, it follows that $F^{(*)}_v$ is uncountable for some $v \in S$.

Similarly, $F^{(*)}_w$ is uncountable for some $w \in T$.

For such $v$ and $w$, $F(v)$ and $F(w)$ are closed intervals of length $< r$. But since
$\inf U - \sup L \ge 2r$ and $F(v) \cap L$ and $F(w) \cap U$ are both nonempty, it follows that
$F(v)$ and $F(w)$ must be disjoint. □


Note that the v and w of the lemma must be proper extensions of u. 

To continue with the proof of the theorem, we repeatedly apply the lemma to
build a Cantor system by associating with each binary string $u \in \{0, 1\}^*$ a string
$t(u) \in \mathbf{N}^*$ such that for all $u, u' \in \{0, 1\}^*$:

1. $F^{(*)}_{t(u)}$ is uncountable and $\mathrm{len}(F_{t(u)}) < \frac{1}{\mathrm{len}(u) + 1}$.
2. If $u'$ properly extends $u$ then $t(u')$ properly extends $t(u)$.
3. $F_{t(u^\frown 0)} \cap F_{t(u^\frown 1)} = \emptyset$.


To do this, note that since $A = F^{(*)}_\varepsilon$ is uncountable we can use the lemma to choose
$v_0$ with $\mathrm{len}(F_{v_0}) < 1$ and $F^{(*)}_{v_0}$ uncountable, and define $t(\varepsilon) := v_0$. Then, having
defined $t(u)$ for $u \in \{0, 1\}^*$ with $F^{(*)}_{t(u)}$ uncountable, we can use the lemma to choose
$v$ and $w$ properly extending $t(u)$ such that $F_v$ and $F_w$ are disjoint intervals of length
$< 1/(\mathrm{len}(u) + 2)$ with both $F^{(*)}_v$ and $F^{(*)}_w$ uncountable, and define $t(u^\frown 0) := v$ and
$t(u^\frown 1) := w$.

This makes the family $(F_{t(u)} \mid u \in \{0, 1\}^*)$ a Cantor system. Hence if $\varphi(z)$
denotes the unique member of $\bigcap_n F_{t(z|n)}$, then $\varphi: \{0, 1\}^{\mathbf{N}} \to \mathbf{R}$ is an injective
mapping whose range $\mathrm{ran}(\varphi)$ is a generalized Cantor set. Also if $z \in \{0, 1\}^{\mathbf{N}}$, then
$t(z|1), t(z|2), t(z|3), \ldots$ form an infinite sequence of members of $\mathbf{N}^*$ where each
term properly extends all the preceding ones, and so they define a unique $y \in \mathbf{N}^{\mathbf{N}}$
such that $y|n$ is a prefix of $t(z|n)$ for each $n$. Hence $\varphi(z) \in \bigcap_n F_{t(z|n)} \subseteq \bigcap_n F_{y|n} \subseteq A$.
Thus $\mathrm{ran}(\varphi) \subseteq A$, and therefore $A$ contains the generalized Cantor set $\mathrm{ran}(\varphi)$ (which
is perfect and of cardinality $\mathfrak{c}$). □


This theorem was the final achievement in the classical program of showing that 
a class of effectively defined sets has the perfect set property and therefore the 
Continuum Hypothesis holds when restricted to that class of sets. There were early 
attempts to incrementally extend such restricted forms of CH to larger and larger 
collections of sets of reals—by showing that the collection in question has the 
perfect set property. Of course, Bernstein showed that there are sets which do not 
have the perfect set property, but such sets are not effectively defined, and so one 
could still hope for larger collections of effectively defined sets to possess the perfect 
set property. 

The earliest major result along this line was the Cantor—Bendixson theorem 
(Corollary 973): The class of closed sets has the perfect set property. Alexandrov 
and Hausdorff extended the result to the class of Borel sets. 

The last theorem says that the collection of analytic sets (which includes the 
Borel sets) has the perfect set property. This is essentially the best result that can be 
proved using the usual axioms of set theory, since without additional set-theoretic 
assumptions it cannot be proved that the coanalytic sets have the perfect set property. 
The classical program of extending the perfect set property for effectively defined 
sets thus came to a stall, and mathematicians such as Lusin realized that a limit has 
been reached. 


Regularity Properties of Analytic Sets 


The perfect set property is a regularity property of a set. Combining Corollary 1158 
and Theorem 1160, we get the following classical result. 

Theorem 1162 (Regularity Properties of Analytic Sets). Every analytic set is
Lebesgue measurable, has the Baire property, and has the perfect set property.


18.6 A Non-Borel Analytic Set 


Coding Subsets of N* by Elements of the Cantor Set 


We first need a special effective one-to-one enumeration of the set N* of all finite 
strings of natural numbers. 


Proposition 1163. There is an enumeration of $\mathbf{N}^*$ without repetitions

$$\mathbf{N}^* = \{u_1, u_2, \ldots, u_n, u_{n+1}, \ldots\}$$

such that if $u_m$ is a proper initial segment of $u_n$ then $m < n$.


Proof. Let $p_n$ denote the $n$-th prime, so that $p_1 = 2$, $p_2 = 3$, etc.

First take $u_1$ to be the empty sequence, i.e., put $u_1 := \varepsilon$.

Next, for each $n > 1$ let $k$ be the largest integer such that $p_k \mid n$. Then $n$ can be
written as

$$n = p_1^{n_1 - 1}\, p_2^{n_2 - 1} \cdots p_{k-1}^{n_{k-1} - 1}\, p_k^{n_k}$$

for a unique sequence of $k$ positive integers $n_1, n_2, \ldots, n_k \in \mathbf{N}$. Now define
$u_n := n_1 n_2 \cdots n_k$.

It is now readily verified that the strings $u_1, u_2, \ldots$ form a one-to-one enumeration
of $\mathbf{N}^*$ and satisfy the condition of the proposition. □
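
The enumeration in the proof can be tried out on small indices. The following is a minimal Python sketch, not part of the original text: the helper names `primes`, `string_of`, and `index_of` are hypothetical, strings are represented as tuples of positive integers, and trial division is used only because it is adequate for a small demonstration.

```python
from itertools import count

def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for small demos)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

def string_of(n):
    """Return u_n as a tuple of positive integers; u_1 is the empty tuple."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    exps = []
    for p in primes():
        if n == 1:
            break
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    if not exps:
        return ()
    # The exponents are n_1 - 1, ..., n_{k-1} - 1, n_k.
    return tuple(e + 1 for e in exps[:-1]) + (exps[-1],)

def index_of(u):
    """Inverse of string_of: the unique n with u_n == u."""
    n = 1
    for i, (a, p) in enumerate(zip(u, primes())):
        e = a if i == len(u) - 1 else a - 1
        n *= p ** e
    return n

if __name__ == "__main__":
    for n in range(1, 13):
        print(n, string_of(n))
```

Running the script lists the first twelve strings; the checks one would expect (round-tripping through `index_of`, and every proper initial segment having a smaller index) can be made on any finite range.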
We now fix, once and for all, an enumeration $u_1, u_2, \ldots, u_n, \ldots$ of $\mathbf{N}^*$ as in the
above proposition:

Definition 1164. $u_1, u_2, \ldots, u_n, \ldots$ will denote the specific sequence of strings of
Proposition 1163 which enumerate $\mathbf{N}^*$ without repetitions and satisfy the condition:
If $u_m$ is a proper initial segment of $u_n$, then $m < n$.

As usual (review Sect. 6.6), we identify each point $x$ in the Cantor set $K \subseteq \mathbf{R}$
with the (unique) infinite binary sequence $(x_n)$ in $\{0,1\}^{\mathbf{N}}$ such that $x = \sum_{n=1}^{\infty} \frac{2x_n}{3^n}$.
Conversely, any $(x_n) \in \{0,1\}^{\mathbf{N}}$ is mapped to $x = \sum_{n=1}^{\infty} \frac{2x_n}{3^n} \in K$. For $x \in K$ and
$(x_n) \in \{0,1\}^{\mathbf{N}}$, we will use the notation "$x \sim (x_n)$" to express this identification,
i.e., as an abbreviation for "$x = \sum_{n=1}^{\infty} \frac{2x_n}{3^n}$".


Definition 1165. For $(x_n) \in \{0,1\}^{\mathbf{N}}$ and $x \in K$, we write

$$x \sim (x_n) \iff x = \sum_{n=1}^{\infty} \frac{2x_n}{3^n} \qquad (x \in K,\ (x_n) \in \{0,1\}^{\mathbf{N}}).$$
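
The identification of Definition 1165 can be illustrated on finite prefixes with exact rational arithmetic. This is only a sketch under stated assumptions: the function names `cantor_point` and `cantor_bits` are hypothetical, a finite 0/1 list stands in for a sequence that is zero from some point on, and digits are read off by repeatedly asking whether the point lies in the left or the right third.

```python
from fractions import Fraction

def cantor_point(bits):
    """Partial sum  sum_{n=1}^{k} 2*x_n / 3^n  for a finite 0/1 prefix."""
    return sum(Fraction(2 * b, 3 ** n) for n, b in enumerate(bits, start=1))

def cantor_bits(x, k):
    """First k digits x_1, ..., x_k of a rational point x of the Cantor set.

    Raises ValueError as soon as x is seen to lie in a removed middle third.
    """
    x = Fraction(x)
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    bits = []
    for _ in range(k):
        if x <= Fraction(1, 3):        # left third: digit 0
            bits.append(0)
            x = 3 * x
        elif x >= Fraction(2, 3):      # right third: digit 1
            bits.append(1)
            x = 3 * x - 2
        else:
            raise ValueError("x is not in the Cantor set")
    return bits
```

For instance, the prefix 1, 0, 1 corresponds to the point 2/3 + 2/27 = 20/27, and the endpoint 1/3 is recovered as the sequence 0, 1, 1, 1, ... because 1/3 = 2/9 + 2/27 + ....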


Definition 1166 (Coding subsets of $\mathbf{N}^*$ by points of the Cantor set). For each $x$
in the Cantor set $K$ with $x \sim (x_n) \in \{0,1\}^{\mathbf{N}}$, define the subset $U(x) \subseteq \mathbf{N}^*$ by:

$$U(x) := \{u_n \mid x_n = 1\}.$$

We say that $U(x)$ is the subset of $\mathbf{N}^*$ coded by $x$. More generally, if $x$ is a member
of the Cantor set $K$ and $A \subseteq \mathbf{N}^*$ then we say that $x$ codes $A$ (or $x$ is a code for $A$)
if $A = U(x)$.


So for $x \in K$ and $A \subseteq \mathbf{N}^*$, we have: $x$ codes $A \iff U(x) = A$, and thus we have a
natural bijection between the Cantor set $K$ and $\mathcal{P}(\mathbf{N}^*)$, the power set of $\mathbf{N}^*$, via the
one-to-one correspondence:

$$x \longleftrightarrow U(x) \qquad (x \in K,\ U(x) \subseteq \mathbf{N}^*).$$

Now consider the set of $x \in K$ for which $U(x) \subseteq \mathbf{N}^*$ is a tree over $\mathbf{N}$:

Problem 1167. Show that the set $\{x \in K \mid U(x) \text{ is a tree}\}$ is closed, i.e., the set of
those members of the Cantor set which code trees is a closed set.
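
To get a feel for the coding, one can decode a finite prefix $x_1, \ldots, x_k$ and look for an obstruction to $U(x)$ being a tree. The sketch below is illustrative only and makes two assumptions that should be checked against the book's definitions: a tree is taken to be a set of strings closed under initial segments, and `string_of`, `coded_strings`, and `violates_tree` are hypothetical names. Because every proper initial segment of $u_n$ has a smaller index than $n$, a missing initial segment is already visible in the prefix that contains $u_n$.

```python
def string_of(n):
    """u_n from the prime-exponent enumeration, as a tuple (u_1 is empty)."""
    exps, p = [], 2
    while n > 1:
        if all(p % q for q in range(2, int(p ** 0.5) + 1)):  # p is prime
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exps.append(e)
        p += 1
    if not exps:
        return ()
    return tuple(e + 1 for e in exps[:-1]) + (exps[-1],)

def coded_strings(bits):
    """The part of U(x) determined by the finite prefix x_1, ..., x_k."""
    return {string_of(n) for n, b in enumerate(bits, start=1) if b == 1}

def violates_tree(bits):
    """True if some coded string has a proper initial segment that is not coded."""
    members = coded_strings(bits)
    return any(u[:j] not in members for u in members for j in range(len(u)))
```

With the first six strings being the empty string, 1, 11, 2, 111, 21, the prefix 1, 1, 1, 0, 0, 0 shows no violation, while 1, 0, 1, 0, 0, 0 does (it contains 11 but not 1). A violation found in a prefix persists in every extension of that prefix.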
Problem 1168. Show that the following subsets of the Cantor set are Borel: 


1. The set of codes for trees in which every node has a proper extension. 

2. The set of codes for trees in which every node has at least two immediate 
extensions. 

3. The set of codes for well-founded trees having rank < 2. 

4. The set of codes for well-founded trees of finite rank. 

5. The set of codes for well-founded trees of rank $< \alpha$, where $\alpha < \omega_1$.


[Hint: Reviewing the proof of Theorem 1154 may help.] 


Definition 1169. WF denotes the set of codes for well-founded trees, IF denotes
the set of codes for ill-founded (non-well-founded) trees, and $\mathrm{WF}_\alpha$ denotes the set
of codes of well-founded trees having rank $< \alpha$.


In other words, if $K$ denotes the Cantor set then

$$\mathrm{WF} = \{x \in K \mid U(x) \text{ is a well-founded tree}\},$$
$$\mathrm{IF} = \{x \in K \mid U(x) \text{ is an ill-founded tree}\}, \text{ and}$$
$$\mathrm{WF}_\alpha = \{x \in \mathrm{WF} \mid \operatorname{rank}(U(x)) < \alpha\}.$$

Recall that every countable well-founded tree has countable rank, so that

$$\mathrm{WF} = \bigcup_{\alpha < \omega_1} \mathrm{WF}_\alpha.$$

Moreover, since for each countable ordinal $\alpha$ there is a countable tree of rank $> \alpha$,
it follows that $\mathrm{WF} \smallsetminus \mathrm{WF}_\alpha$ is nonempty for all $\alpha < \omega_1$. Also, we saw in the previous
problem that $\mathrm{WF}_\alpha$ is a Borel set for each countable ordinal $\alpha$.


Definition 1170. We say that a Suslin system $\langle S_u \mid u \in \mathbf{N}^*\rangle$ dominates WF if for
all $x \in \mathrm{WF}$, the set $\{u \in \mathbf{N}^* \mid x \in S_u\}$ is a well-founded tree of rank $\geq \operatorname{rank}(U(x))$.
In other words, $\langle S_u \mid u \in \mathbf{N}^*\rangle$ dominates WF if for any $x \in \mathrm{WF} \smallsetminus \mathrm{WF}_\alpha$, $\{u \in \mathbf{N}^* \mid x \in S_u\}$
is a well-founded tree of rank $\geq \alpha$.


Note that if a Suslin system $\langle S_u \mid u \in \mathbf{N}^*\rangle$ dominates WF, then the analytic set $\mathcal{A}_u S_u$
generated by it must be a subset of IF (where IF = the set of codes for ill-founded
trees). We now show that $\mathcal{A}_u S_u$ can actually be equal to IF for a suitably chosen
Suslin system $\langle S_u \mid u \in \mathbf{N}^*\rangle$ dominating WF, i.e., we can have $\mathrm{IF} = \mathcal{A}_u S_u$ for some
$\langle S_u \mid u \in \mathbf{N}^*\rangle$ which dominates WF.


Proposition 1171. The set IF of codes for ill-founded trees is an analytic set.
Moreover, there is a Suslin system $\langle S_u \mid u \in \mathbf{N}^*\rangle$ which dominates WF and generates
IF (so that $\mathrm{IF} = \mathcal{A}_u S_u$).


Proof. To simplify notation, a finite sequence $\langle n_1, n_2, \ldots, n_k\rangle$ will be denoted
simply by the string $n_1 n_2 \cdots n_k$. Note the mapping

$$(i, j) \mapsto 2j - i$$

is a natural effective bijection from $\{0,1\} \times \mathbf{N}$ onto $\mathbf{N}$, and its inverse is the mapping

$$n \mapsto (\hat{n}, \bar{n}),$$

where we are writing, for any $n \in \mathbf{N}$:

$$\bar{n} := \lfloor (n+1)/2 \rfloor, \quad\text{and}\quad \hat{n} := 2\bar{n} - n.$$

The above bijection also induces other bijections. For example, the mapping

$$n_1 n_2 \cdots n_k \mapsto (\hat{n}_1 \hat{n}_2 \cdots \hat{n}_k,\ \bar{n}_1 \bar{n}_2 \cdots \bar{n}_k)$$

is an effective bijection from $\mathbf{N}^*$ onto $\{(u,v) \in \{0,1\}^* \times \mathbf{N}^* \mid \operatorname{len}(u) = \operatorname{len}(v)\}$.
Similarly, it also induces a natural effective bijection between $\mathbf{N}^{\mathbf{N}}$ and $\{0,1\}^{\mathbf{N}} \times \mathbf{N}^{\mathbf{N}}$.
In this last bijection, an element $(y_n) \in \mathbf{N}^{\mathbf{N}}$ can be thought of as coding the pair
of sequences $((\hat{y}_n), (\bar{y}_n))$, where the "left sequence" $(\hat{y}_n) \in \{0,1\}^{\mathbf{N}}$ is an infinite
binary sequence which can be used to code a tree, while the "right sequence" $(\bar{y}_n) \in \mathbf{N}^{\mathbf{N}}$
can be used to code an infinite branch through that tree.

Let $\langle I[u] \mid u \in \{0,1\}^*\rangle$ be the classical Cantor system that was defined in
Sect. 6.6 as:

$$I[\varepsilon] := [0,1], \quad\text{and for all } u \in \{0,1\}^*:\quad I[u^\frown 0] := \text{left-third of } I[u], \quad I[u^\frown 1] := \text{right-third of } I[u].$$

Thus for a real $x$ in the Cantor set $K$ with $x \sim (x_n)$, we have $x \in I[n_1 n_2 \cdots n_k]$ if
and only if $x_j = n_j$ for $j = 1, 2, \ldots, k$.

Now define the required Suslin system $S = \langle S_u \mid u \in \mathbf{N}^*\rangle$ as follows:

$$S_{n_1 n_2 \cdots n_k} := \begin{cases} \varnothing & \text{if } \exists\, i, j \le k\ (u_i \subseteq u_j \wedge \hat{n}_i = 0 \wedge \hat{n}_j = 1), \\ \varnothing & \text{if } \exists\, i, j \le k\ (u_i = \bar{n}_1 \bar{n}_2 \cdots \bar{n}_j \wedge \hat{n}_i = 0), \\ I[\hat{n}_1 \hat{n}_2 \cdots \hat{n}_k] & \text{otherwise,} \end{cases}$$

where $u_n$, $n = 1, 2, 3, \ldots$, is the enumeration of $\mathbf{N}^*$ as in Definition 1164.

Note that if $(n_k)_{k \in \mathbf{N}} \in \mathbf{N}^{\mathbf{N}}$ and $x \in \bigcap_k S_{n_1 n_2 \cdots n_k}$, then $x$ is a member of the
Cantor set with $x \sim (\hat{n}_k)_{k \in \mathbf{N}}$, so $x$ codes a tree (by the first clause of the definition
of $S$), and $\bar{n}_1 \bar{n}_2 \cdots \bar{n}_k \cdots$ is an infinite branch through the tree $U(x)$ coded by $x$ (by
the second clause). Conversely, if $x \sim (x_k)_{k \in \mathbf{N}}$ is in IF then the set $U(x)$ coded
by $x$ is a tree containing some infinite branch, say $m_1 m_2 \cdots m_k \cdots$, which implies
$x \in \bigcap_k S_{n_1 n_2 \cdots n_k}$ where $n_k := 2m_k - x_k$. It follows that $\mathrm{IF} = \mathcal{A}_u S_u$.

Next, suppose that $x \sim (x_n)$ codes a well-founded tree, i.e., $x \in \mathrm{WF}$. For each
$u = m_1 m_2 \cdots m_k \in U(x)$, define $f(u) := n_1 n_2 \cdots n_k$ where

$$n_j := 2m_j - x_j, \qquad j = 1, 2, \ldots, k.$$

For each $u \in U(x)$, we have $x \in S_{f(u)}$, and so $f$ is a function from $U(x)$ to
$\{v \mid x \in S_v\}$, i.e., $f\colon U(x) \to \{v \mid x \in S_v\}$. Also $f$ is strictly increasing (i.e., if
$u'$ is a proper extension of $u$ then $f(u')$ is a proper extension of $f(u)$), and so by
Problem 819 the rank of $U(x)$ is at most the rank of $\{v \mid x \in S_v\}$. □
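
The pairing $(i, j) \mapsto 2j - i$ used in the proof above, together with the induced splitting of a finite string into a binary "left" string and a "right" string over $\mathbf{N}$, is easy to experiment with. The following is a small Python sketch; the names `pair`, `unpair`, `split`, and `merge` are hypothetical and $\mathbf{N}$ is taken to be the positive integers, as the formula requires.

```python
def pair(i, j):
    """Map (i, j) in {0,1} x N to 2j - i in N (N = positive integers)."""
    if i not in (0, 1) or j < 1:
        raise ValueError("need i in {0, 1} and j >= 1")
    return 2 * j - i

def unpair(n):
    """Inverse of pair: return (i, j) with j = floor((n+1)/2), i = 2j - n."""
    if n < 1:
        raise ValueError("need n >= 1")
    j = (n + 1) // 2
    return 2 * j - n, j

def split(seq):
    """Split a string over N into (binary left string, right string over N)."""
    pairs = [unpair(n) for n in seq]
    return tuple(i for i, _ in pairs), tuple(j for _, j in pairs)

def merge(bits, branch):
    """Inverse of split for two strings of equal length."""
    if len(bits) != len(branch):
        raise ValueError("strings must have equal length")
    return tuple(pair(i, j) for i, j in zip(bits, branch))
```

For example, the string 1 2 5 splits into the binary string 1 0 1 and the string 1 1 3, and `merge` puts the two back together.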


Corollary 1172. WF is a coanalytic set. 


The Boundedness Theorem 


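Theorem 1173. If $B \subseteq \mathrm{WF}$ is analytic then $B \subseteq \mathrm{WF}_\alpha$ for some $\alpha < \omega_1$.


Proof. Let $B \subseteq \mathrm{WF}$ be analytic, and let $\langle B_u \mid u \in \mathbf{N}^*\rangle$ be a Suslin system with
$B = \mathcal{A}_u B_u$. Let $\langle S_u \mid u \in \mathbf{N}^*\rangle$ be a Suslin system as in the above proposition: That is,
$\langle S_u \mid u \in \mathbf{N}^*\rangle$ generates IF and dominates WF, so that for each $x \in \mathrm{WF}$, the rank
of $U(x)$ is at most that of the well-founded tree $\{u \mid x \in S_u\}$. Define a tree $S \otimes B$ on
$(\mathbf{N} \times \mathbf{N})^*$ by the condition that

$$\langle m_1, n_1\rangle \langle m_2, n_2\rangle \cdots \langle m_k, n_k\rangle \in S \otimes B \iff S_{m_1 m_2 \cdots m_k} \cap B_{n_1 n_2 \cdots n_k} \neq \varnothing.$$

Then $S \otimes B$ is well-founded since $\mathrm{IF} \cap B = \varnothing$. Let $\alpha$ be the rank of $S \otimes B$. Now for
each $x \in B$, we can fix an element $(n_k) \in \mathbf{N}^{\mathbf{N}}$ such that $x \in B_{n_1 n_2 \cdots n_k}$ for all $k \in \mathbf{N}$,
and then define $f_x\colon \{u \mid x \in S_u\} \to S \otimes B$ by setting

$$f_x(m_1 m_2 \cdots m_k) := \langle m_1, n_1\rangle \langle m_2, n_2\rangle \cdots \langle m_k, n_k\rangle.$$

Then $f_x$ is strictly increasing, so by Problem 819, $\operatorname{rank}(\{u \mid x \in S_u\}) \le \alpha$. But
then $\operatorname{rank}(U(x)) \le \operatorname{rank}(\{u \mid x \in S_u\}) \le \alpha$, so $x \in \mathrm{WF}_{\alpha+1}$. □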


If $\alpha < \omega_1$ then $\mathrm{WF}_\alpha$ must be a proper subset of WF, since by Problem 857 there
are well-founded trees $T \subseteq \mathbf{N}^*$ with $\operatorname{rank}(T) > \alpha$. Hence no analytic subset of WF
can equal WF, and we have the following immediate corollary.


Corollary 1174. WF is not analytic, and hence not Borel. Consequently, IF is an 
analytic set which is not Borel. 


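Notice the effective nature of the proof that WF is not analytic: Each Suslin
system $\langle B_u \mid u \in \mathbf{N}^*\rangle$ effectively determines the tree $S \otimes B$ of the proof above,
which we regard as a subset of $\mathbf{N}^*$ (by identifying $(\mathbf{N} \times \mathbf{N})^*$ with $\mathbf{N}^*$ via any
fixed bijection between $\mathbf{N} \times \mathbf{N}$ and $\mathbf{N}$). Moreover, for each tree $T \subseteq \mathbf{N}^*$ let
$T^+ := \{\varepsilon\} \cup \{\langle 1, n_1, n_2, \ldots, n_k\rangle \mid \langle n_1, n_2, \ldots, n_k\rangle \in T\}$, so that $T^+$ is well-founded
whenever $T$ is well-founded, with $\operatorname{rank}(T^+) = \operatorname{rank}(T) + 1$. Finally, let
$h(\langle B_u \mid u \in \mathbf{N}^*\rangle)$ be the code for the tree $(S \otimes B)^+$. We thus have an effective
function which assigns to each Suslin system $\langle B_u \mid u \in \mathbf{N}^*\rangle$ the real number
$h(\langle B_u \mid u \in \mathbf{N}^*\rangle)$ satisfying the following property:

For every Suslin system $\langle B_u \mid u \in \mathbf{N}^*\rangle$,

$$\text{if } \mathcal{A}_u B_u \subseteq \mathrm{WF} \text{ then } h(\langle B_u \mid u \in \mathbf{N}^*\rangle) \in \mathrm{WF} \smallsetminus \mathcal{A}_u B_u.$$

Note also that the set WF, being a subset of the Cantor set, has measure zero and is
nowhere dense, and therefore is Lebesgue measurable and has the Baire property. Thus
WF is an explicitly defined example of a Lebesgue measurable set which is not a
Borel set.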
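
The operation $T \mapsto T^+$ and the identity $\operatorname{rank}(T^+) = \operatorname{rank}(T) + 1$ can be checked on finite trees. The sketch below is only an illustration under an assumed convention: a tree is a finite set of tuples containing the empty tuple and closed under initial segments, a terminal node has rank 0, the rank of a node is the maximum of (rank of child) + 1 over its children, and the rank of the tree is the rank of its root. The names `rank` and `plus` are hypothetical.

```python
def rank(tree, node=()):
    """Rank of `node` in a finite tree of tuples (terminal nodes have rank 0)."""
    children = [u for u in tree
                if len(u) == len(node) + 1 and u[:len(node)] == node]
    return max((rank(tree, c) + 1 for c in children), default=0)

def plus(tree):
    """T+ : a new root with the single child <1>, below which T is copied."""
    return {()} | {(1,) + u for u in tree}

if __name__ == "__main__":
    T = {(), (1,), (1, 2), (3,)}
    print(rank(T), rank(plus(T)))
```

For the sample tree, the root has rank 2 and the root of the enlarged tree has rank 3, in agreement with the identity quoted above.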


Corollary 1174 is a modern version of a result of Lusin. It was originally stated 
in terms of continued fractions, but can be reformulated as follows. 


Theorem 1175 (Lusin). Let $L$ be the set of all $x \sim (x_n)$ in the Cantor set for
which there are positive integers $n_1 < n_2 < \cdots < n_k < \cdots$ such that for all $k \in \mathbf{N}$,
$x_{n_k} = 1$ and $n_k$ divides $n_{k+1}$. Then $L$ is analytic but not Borel.


Problem 1176. Prove Theorem 1175.


[Hint: First find an injection $f\colon \mathbf{N}^* \to \mathbf{N}$ such that $u$ is an initial prefix of $v \iff f(u)$
divides $f(v)$. Problems 996, 1102, and 1123 can then help.]
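
Problem 1177 (Borel Codes). Fix an enumeration $\langle I_n \mid n \in \mathbf{N}\rangle$ of all open intervals
with rational endpoints. For each well-founded tree $T \subseteq \mathbf{N}^*$, define the set
$B(T) \subseteq \mathbf{R}$ by recursion on $\operatorname{rank}(T)$ as follows:

$$B(T) := \begin{cases} \bigcup \{I_n \mid \langle n\rangle \in T,\ n \in \mathbf{N}\} & \text{if } \operatorname{rank}(T) \le 1, \\ \mathbf{R} \smallsetminus \bigcap \{B(T^{(n)}) \mid n \in \mathbf{N}\} & \text{if } \operatorname{rank}(T) > 1. \end{cases}$$

Then $E$ is Borel if and only if $E = B(T)$ for some well-founded tree $T \subseteq \mathbf{N}^*$.

Problem 1178 ($\boldsymbol{\Sigma}^0_\alpha$ and $\boldsymbol{\Pi}^0_\alpha$). Let $B(T)$ be as above. For each $\alpha < \omega_1$, put:

$$\boldsymbol{\Sigma}^0_\alpha := \{B(T) \mid \operatorname{rank}(T) \le \alpha\}, \quad\text{and}\quad \boldsymbol{\Pi}^0_\alpha := \{\mathbf{R} \smallsetminus E \mid E \in \boldsymbol{\Sigma}^0_\alpha\}.$$

Then $\boldsymbol{\Sigma}^0_1$ is the class of open sets, $\boldsymbol{\Sigma}^0_2$ the $F_\sigma$ sets, $\boldsymbol{\Sigma}^0_3$ the $G_{\delta\sigma}$ sets, and so on.
Similarly, $\boldsymbol{\Pi}^0_1$ = closed, $\boldsymbol{\Pi}^0_2 = G_\delta$, $\boldsymbol{\Pi}^0_3 = F_{\sigma\delta}$, etc. Furthermore, we have:
$\mathcal{B} = \{B(T) \mid T \subseteq \mathbf{N}^* \text{ is a well-founded tree}\} = \bigcup_{\alpha<\omega_1} \boldsymbol{\Sigma}^0_\alpha = \bigcup_{\alpha<\omega_1} \boldsymbol{\Pi}^0_\alpha$.


One also defines $\boldsymbol{\Sigma}^1_1$ := the class of analytic sets, $\boldsymbol{\Pi}^1_1$ := the coanalytic sets, and
$\boldsymbol{\Delta}^1_1 := \boldsymbol{\Sigma}^1_1 \cap \boldsymbol{\Pi}^1_1$ = sets which are both analytic and coanalytic. In this notation,
Suslin’s theorem is expressed by the equation $\boldsymbol{\Delta}^1_1 = \mathcal{B}$.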


Borel and Analytic Sets in More General Spaces* 


We have limited ourselves to $\mathbf{R}$, but the concepts of Borel and analytic sets can
be readily extended to the higher dimensional spaces $\mathbf{R}^n$. One can then show that
$A \subseteq \mathbf{R}$ is analytic if and only if $A$ is the projection of a Borel set $B \subseteq \mathbf{R}^2$, i.e.,
$A = \{x \in \mathbf{R} \mid (x, y) \in B \text{ for some } y\}$ for some Borel $B$.

In the higher dimensional spaces, one can obtain universal sets. For example,
there is a universal open $G \subseteq \mathbf{R}^2$ such that every open subset $U \subseteq \mathbf{R}$ is a section
of $G$, i.e., $U = \{x \mid (x, y) \in G\}$ for some $y$. Such universal sets are available for
every level $\boldsymbol{\Sigma}^0_\alpha$, $\alpha < \omega_1$, of the Borel hierarchy. A simple Cantor diagonalization
then shows that the Borel hierarchy “keeps producing new sets”: There are sets in
each level which do not belong to any lower level.

Similarly, there are universal analytic sets which easily produce non-Borel 
analytic sets. Our proof to produce such sets was much harder, but had the benefit 
of obtaining a highly effective form of the boundedness theorem. 

Separable complete metric spaces (Polish spaces) provide an even more general 
and natural setting for studying Borel and analytic sets. Some examples are the 
Baire space $\mathbf{N}^{\mathbf{N}}$, the Cantor space $\{0,1\}^{\mathbf{N}}$, and the space $C[0,1]$ of continuous
functions on [0,1] under uniform convergence. For example, in C[0, 1], the set 
of everywhere differentiable functions is coanalytic but not Borel (Mazurkiewicz), 
and the set of functions which satisfy Rolle’s theorem is non-Borel analytic 
(Woodin). More examples occur in various areas such as analysis and topology. 
See [38, 45, 46, 55, 64]. 


Chapter 19 
Postscript II: Measurability and Projective Sets 


Abstract In this postscript, we describe two important classical problems of real 
analysis that could not be settled using the usual axioms of set theory: (1) The 
Measure Problem on extending Lebesgue measure to all of P(R), and (2) Lusin’s 
Problem on properties of PCA sets and projective sets. Ulam’s analysis of Problem 1 
(Measure Problem) led to large cardinals known as measurable cardinals, which, 
surprisingly enough, were shown by Solovay to have remarkable implications for
Problem 2 (Lusin’s Problem) as well. The independence results mentioned here 
illustrate the prophetic nature of Lusin’s conviction that the problems of PCA and 
projective sets are unsolvable. This also sets up the background for Postscript IV 
which will describe how larger cardinals and determinacy essentially “solve” (!) 
Lusin’s Problem. 


19.1 The Measure Problem and Measurable Cardinals 


Banach and other Polish mathematicians investigated the question whether there is 
an extension of Lebesgue measure defined on all sets of reals. It is clear that the 
question remains equivalent if we replace R by [0, 1], so the problem can be stated 
as follows. 


The Measure Extension Problem (Lebesgue). Does there exist a countably
additive set function $\mu\colon \mathcal{P}([0,1]) \to [0,1]$ defined on all of $\mathcal{P}([0,1])$ which extends
Lebesgue measure?


Banach and Kuratowski showed that if the Continuum Hypothesis (CH) holds, then 
the above measure extension problem has a negative answer. 


Problem 1179. Let $\mu\colon \mathcal{P}(\mathbf{R}) \to [0, \infty]$ be a set function which is countably additive
and such that for any bounded interval $I$, the measure $\mu(I)$ equals the length of the
interval $I$. Then show that $\mu$ extends Lebesgue measure.

We saw that Vitali sets and Bernstein sets cannot be Lebesgue measurable. The
crucial property of measure used in the Vitali proof is translation invariance, and
that in the Bernstein proof is outer regularity. Indeed:


Theorem 1180 (Vitali and Bernstein). Let $\mu$ be any countably additive nonnegative
set function defined on a sigma algebra $\mathcal{S} \subseteq \mathcal{P}(\mathbf{R})$ containing all intervals and
with $\mu(\mathbf{R}) > 0$. Then the following hold.


1. If $\mu$ is translation invariant and bounded on the intervals ($\mu([a,b]) < \infty$ for all
$a < b$), then no Vitali set is in $\mathcal{S}$.

2. If $\mu(\{x\}) = 0$ for all $x$ and $\mu$ is outer regular (for all $E \in \mathcal{S}$ and $\epsilon > 0$ there is
open $G \supseteq E$ with $\mu(G \smallsetminus E) < \epsilon$), then any $E \in \mathcal{S}$ with $\mu(E) > 0$ contains an
uncountable closed set, and so no Bernstein set is in $\mathcal{S}$.


Proof. The proof of (1) is exactly the same as the original proof for Lebesgue measure:
Let $V$ be a Vitali set, so that $(V + r) \cap (V + s) = \varnothing$ for all rational $r \neq s$,
and $\bigcup_{r \in \mathbf{Q}} (V + r) = \mathbf{R}$. If $V$ were $\mu$-measurable (i.e., $V \in \mathcal{S}$) then we would have
$\mu(V) > 0$ (otherwise translation invariance and countable additivity would give $\mu(\mathbf{R}) = 0$),
so there are $a < b$ with $\mu(V \cap [a,b]) > 0$. Put $W := V \cap [a,b]$.
Then $\langle W + r \mid r \in \mathbf{Q} \cap [0,1]\rangle$ is a family of pairwise disjoint $\mu$-measurable
sets all having constant measure $\mu(W) > 0$ and all contained in $[a, b+1]$, which is
impossible since $\mu([a, b+1]) < \infty$.

For (2), let $E$ be $\mu$-measurable (i.e., $E \in \mathcal{S}$) with $\mu(E) > 0$. Fix $a < b$ such
that $\mu([a,b] \cap E) > 0$, and put $A := [a,b] \cap E$, $B := [a,b] \smallsetminus E$ so that $\mu(A) + \mu(B) = \mu([a,b])$.
Since $\mu(A) > 0$ and $\mu$ is outer regular, there is open $G \supseteq B$
with $\mu(G \smallsetminus B) < \mu(A)$. Put $F := [a,b] \smallsetminus G$. Then $F$ is closed with $F \subseteq A$.
Now $[a,b] \subseteq F \cup (G \smallsetminus B) \cup B$, so

$$\mu([a,b]) \le \mu(F) + \mu(G \smallsetminus B) + \mu(B) < \mu(F) + \mu(A) + \mu(B) = \mu(F) + \mu([a,b]),$$

hence $\mu(F) > 0$. Since $\mu(\{x\}) = 0$ for all $x$, $F$ must be uncountable.


By the Vitali—Bernstein results above, if there is an extension of Lebesgue measure 
defined on all sets of reals, then such an extension can neither be translation invariant 
nor be outer regular. But if we drop these two requirements then the measure 
extension problem remains valid. 

Banach generalized the problem further for measures defined on arbitrary sets
(instead of $[0,1]$) which satisfy the continuity condition $\mu(\{p\}) = 0$ for all $p$ and
have a normalized value for the measure of the whole set:


The Measure Problem (Banach). Is there a nonempty set $X$ and a countably
additive set function $\mu\colon \mathcal{P}(X) \to [0,1]$ defined on all of $\mathcal{P}(X)$ such that $\mu(X) = 1$
and $\mu(\{p\}) = 0$ for all $p \in X$?


We will refer to this problem of Banach as the (general) measure problem. 

The Banach—Kuratowski result was vastly improved by Ulam who did a full 
analysis of the measure problem. Among other things, Ulam showed that if the 
size of the continuum is less than the first weakly inaccessible cardinal then there 
is no extension of Lebesgue measure defined on all sets of reals. Ulam’s work had
significant implications for future research in set theory, and we will now describe 
his work in detail. 

Let us begin with some official definitions.


Definition 1181. By a total measure on a set $X$ we mean a set function $\mu\colon \mathcal{P}(X) \to [0, \infty]$
defined on all subsets of $X$ which is countably additive: If $\langle E_n \mid n \in \mathbf{N}\rangle$ is a
pairwise disjoint family of subsets of $X$ then $\mu\left(\bigcup_n E_n\right) = \sum_n \mu(E_n)$. We also say
that


1. $\mu$ is nontrivial if $\mu(X) > 0$ and $\mu(\varnothing) = 0$.

2. $\mu$ is finite if $\mu(X) < \infty$.

3. $\mu$ is continuous if $\mu(\{p\}) = 0$ for all $p \in X$.

4. $\mu$ is a probability measure if $\mu(X) = 1$.


We will focus our attention on nontrivial finite continuous total measures. Note that
by normalizing if necessary, the existence of such a measure is equivalent to the
existence of a continuous total probability measure, which is exactly what the
measure problem is asking.


Problem 1182. Let $\mu$ be a finite total measure. If $n \in \mathbf{N}$ then any family of pairwise
disjoint sets each of measure $\geq \frac{1}{n}$ is finite. Any family of pairwise disjoint sets of
positive measure is countable.


Definition 1183 (Atomless and Two-Valued Measures). Let $\mu$ be a total measure
on $X$. $A \subseteq X$ is an atom for $\mu$ if $\mu(A) > 0$ and for all $E \subseteq A$ either $\mu(E) = 0$ or
$\mu(A \smallsetminus E) = 0$. The measure $\mu$ is atomless if there is no atom for $\mu$, and $\mu$ is called
a two-valued measure if $X$ itself is an atom for $\mu$.


An atomless measure is continuous and a two-valued measure is nontrivial.


Problem 1184. Let $\mu$ be a finite atomless total measure on $X$. Show that for any $E$
with $\mu(E) > 0$ and any $\epsilon > 0$ there is $S \subseteq E$ such that $0 < \mu(S) < \epsilon$.


[Hint: If $\mu(E) > 0$, then there is $E' \subseteq E$ with $0 < \mu(E') \le \frac{1}{2}\mu(E)$. Repeat.]


Definition 1185 (Separating Families). If $E_i \subseteq X$ for all $i \in I$ then $\langle E_i \mid i \in I\rangle$
is called a separating family of subsets of $X$ if for all $p \neq q$ in $X$, there is $i \in I$
such that $p \in E_i$ and $q \notin E_i$, or $q \in E_i$ and $p \notin E_i$.


For example the family of intervals $\langle (-\infty, r] \mid r \in \mathbf{Q}\rangle$ is a countable separating
family for $\mathbf{R}$. This can be generalized as follows.


Problem 1186. For any cardinal $\lambda$, any set of cardinality at most $2^\lambda$ has a
separating family of size at most $\lambda$.


[Hint: A set $X$ of size at most $2^\lambda$ can be regarded as a subset of $\{0,1\}^A$ for some
$A$ with $|A| = \lambda$. Then $\langle E_a \mid a \in A\rangle$ is a separating family for $\{0,1\}^A$ where
$E_a := \{f \in \{0,1\}^A \mid f(a) = 1\}$ ($a \in A$).]
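
The construction in the hint can be checked by brute force when the index set $A$ is finite. The following Python sketch is only an illustration: functions $f \in \{0,1\}^A$ are represented as dictionaries, and the names `separating_family` and `separates` are hypothetical.

```python
from itertools import combinations, product

def separating_family(index_set):
    """Return all f in {0,1}^A (as dicts) and the sets E_a = {f : f(a) = 1}."""
    index_set = list(index_set)
    points = [dict(zip(index_set, vals))
              for vals in product((0, 1), repeat=len(index_set))]
    family = {a: [f for f in points if f[a] == 1] for a in index_set}
    return points, family

def separates(points, family):
    """True if every pair of distinct points is split by some member of the family."""
    return all(
        any((p in E) != (q in E) for E in family.values())
        for p, q in combinations(points, 2)
    )

if __name__ == "__main__":
    points, family = separating_family("abc")
    print(len(points), len(family), separates(points, family))
```

With a three-element index set there are eight points and only three sets $E_a$, and these three sets already separate all eight points; dropping any one $E_a$ leaves two points that differ only at $a$ unseparated.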

Definition 1187 ($\kappa$-complete measures). Let $\mu$ be a total measure on $X$ and $\kappa$ be a
cardinal. $\mu$ is $\kappa$-complete if for any family $\langle E_i \mid i \in I\rangle$ of subsets of $X$ with $|I| < \kappa$
and $\mu(E_i) = 0$ for all $i \in I$ we have $\mu\left(\bigcup_{i \in I} E_i\right) = 0$.


Note that by definition (countable additivity), every measure is $\aleph_1$-complete.

Proposition 1188. Let $\mu$ be a continuous $\kappa$-complete total measure on a set $X$. If
$X$ has a separating family of size less than $\kappa$, then $\mu$ is atomless.


Proof. Suppose $A \subseteq X$ is an atom for $\mu$, and fix a separating family $\langle E_i \mid i \in I\rangle$
with $|I| < \kappa$. Since $A$ is an atom, for each $i \in I$ we have either $\mu(A \cap E_i) = 0$ or
$\mu(A \smallsetminus E_i) = 0$. Define:

$$A_i := \begin{cases} A \cap E_i & \text{if } \mu(A \cap E_i) = 0, \\ A \smallsetminus E_i & \text{otherwise.} \end{cases}$$

Since $\mu(A_i) = 0$ for all $i$ and $\mu$ is $\kappa$-complete, $\mu\left(\bigcup_{i \in I} A_i\right) = 0$. The set
$A \smallsetminus \left(\bigcup_{i \in I} A_i\right)$ has at most one element, so it has $\mu$-measure zero by continuity of $\mu$.
But then $\mu(A) = 0$, contradicting the fact that $A$ is an atom. □


Since R has a countable separating family, this immediately gives: 


Corollary 1189. Any continuous total measure on a set of size at most $2^{\aleph_0}$ is
atomless.


Thus if a continuous two-valued total measure exists, it can only be defined on 
a set of cardinality larger than that of the continuum. We will see below that 
the cardinality of such a set must actually be greater than or equal to a strongly 
inaccessible cardinal! 


A useful result about atomless measures is the following. 


Proposition 1190. Let $\mu$ be an atomless total measure on $X$. Then there is a
family $\langle B_u \mid u \in \{0,1\}^*\rangle$ of subsets of $X$ indexed by nodes $u$ of the binary tree
$\{0,1\}^*$ satisfying: $B_\varepsilon = X$, $B_u = B_{u^\frown 0} \cup B_{u^\frown 1}$, $B_{u^\frown 0} \cap B_{u^\frown 1} = \varnothing$, and
$\mu(B_{u^\frown 0}) = \mu(B_{u^\frown 1}) = \frac{1}{2}\mu(B_u)$.


Proof. The result easily follows from the following lemma.

Lemma. For any $E \subseteq X$ there is $S \subseteq E$ such that $\mu(S) = \frac{1}{2}\mu(E)$.


Proof (Lemma). Call a family $\mathcal{C}$ of subsets of $E$ adequate if each set in $\mathcal{C}$
has positive measure, distinct sets in $\mathcal{C}$ are disjoint, and whenever $E_1, E_2, \ldots, E_n \in \mathcal{C}$,
we have $\mu\left(\bigcup_{k=1}^{n} E_k\right) \le \frac{1}{2}\mu(E)$. By Problem 1182, each adequate family is
countable. Now consider the collection of all adequate families partially ordered by
inclusion, and apply Zorn’s lemma to get a maximal adequate family $\mathcal{C}$. Put $S := \bigcup \mathcal{C}$.
Then $\mu(S) \le \frac{1}{2}\mu(E)$ (this is easily seen by enumerating $\mathcal{C}$ without repetition
and applying countable additivity of $\mu$). We claim $\mu(S) = \frac{1}{2}\mu(E)$. Otherwise we
could use Problem 1184 to choose $A \subseteq (E \smallsetminus S)$ with $0 < \mu(A) < \frac{1}{2}\mu(E) - \mu(S)$.
Then $\mathcal{C} \cup \{A\}$ would be an adequate family properly extending $\mathcal{C}$, contradicting the
maximality of $\mathcal{C}$ and finishing the proof of the lemma. □


Now construct the family $\langle B_u \mid u \in \{0,1\}^*\rangle$ as follows: Let $B_\varepsilon := X$, and having
defined $B_u$ for $u \in \{0,1\}^*$ use the lemma to choose $S \subseteq B_u$ with $\mu(S) = \frac{1}{2}\mu(B_u)$,
and then put $B_{u^\frown 0} := S$ and $B_{u^\frown 1} := B_u \smallsetminus S$. □




We now have a “converse” to Corollary 1189. 


Corollary 1191. If there is a $\kappa$-complete atomless total probability measure $\mu$ on
a set $X$ of cardinality $\kappa$, then $\kappa \le 2^{\aleph_0}$.


Proof. Fix $\langle B_u \mid u \in \{0,1\}^*\rangle$ as in Proposition 1190. For each $a \in \{0,1\}^{\mathbf{N}}$, let
$X_a := \bigcap_{n \in \mathbf{N}} B_{a|n}$, so that $\mu(X_a) = 0$, and $X = \bigcup \{X_a \mid a \in \{0,1\}^{\mathbf{N}}\}$. If we had $2^{\aleph_0} < \kappa$,
then $\kappa$-completeness would give $\mu(X) = 0$, a contradiction. □


Corollary 1192. Let $\mu$ be a $\kappa$-complete continuous total probability measure on a
set of cardinality $\kappa$. Then $\mu$ is atomless if and only if $\kappa \le 2^{\aleph_0}$.


The following theorem shows that the original measure extension problem for 
Lebesgue measure has little to do with the real numbers themselves, but is really 
a problem of abstract set theory that depends only on the cardinal number of the 
underlying set. 


Theorem 1193 (Banach-Ulam). The following are equivalent:


1. There is a total measure $\mu\colon \mathcal{P}([0,1]) \to [0,1]$ extending Lebesgue measure.

2. There is a total measure $\mu\colon \mathcal{P}(\mathbf{R}) \to [0, \infty]$ extending Lebesgue measure.

3. There is a continuous total probability measure on a set of size $\le 2^{\aleph_0}$.

4. There is an atomless total probability measure on some set.


Proof. 1 ⇒ 2: Assume that $\mu\colon \mathcal{P}([0,1]) \to [0,1]$ is a total measure which extends
Lebesgue measure on $[0,1]$. Then $\bar{\mu}\colon \mathcal{P}(\mathbf{R}) \to [0, \infty]$ is a total measure extending
Lebesgue measure on $\mathbf{R}$, where, for each $E \subseteq \mathbf{R}$, we define:

$$\bar{\mu}(E) := \sum_{n=-\infty}^{\infty} \mu\big((0,1] \cap (E + n)\big).$$

2 ⇒ 3: This is immediate: Restrict $\mu$ to the unit interval.

3 ⇒ 4: Immediate by Corollary 1189.

4 ⇒ 1: Let $\mu\colon \mathcal{P}(X) \to [0,1]$ be an atomless total probability measure on $X$, and fix
$\langle B_u \mid u \in \{0,1\}^*\rangle$ as in Proposition 1190. Each $x \in X$ determines a unique infinite
branch $\{u \in \{0,1\}^* \mid x \in B_u\}$ through the binary tree $\{0,1\}^*$ which can be identified
with an element $b_x = \langle b_x(1), b_x(2), \ldots\rangle \in \{0,1\}^{\mathbf{N}}$. Now define $h\colon X \to [0,1]$ and
a total measure $\nu$ on $[0,1]$ by:

$$h(x) := \sum_{n=1}^{\infty} \frac{b_x(n)}{2^n}, \qquad \nu(E) := \mu\big(h^{-1}[E]\big).$$

$\nu$ is easily verified to be a total measure on $[0,1]$ such that $\nu\left(\left[\frac{k-1}{2^n}, \frac{k}{2^n}\right]\right) = \frac{1}{2^n}$ for
$1 \le k \le 2^n$. Thus the total measure $\nu$ agrees with Lebesgue measure at all dyadic
intervals, and so is an extension of Lebesgue measure on $[0,1]$. □

We now show that if the measure problem has a solution, then such a solution
defined on a set of least possible cardinality $\kappa$ must be $\kappa$-complete:


Proposition 1194. Let $\mu$ be a continuous total probability measure on a set $X$ with
least possible cardinality $\kappa$ (i.e., there is no continuous total probability measure on
any set of cardinality $< \kappa$). Then $\mu$ is $\kappa$-complete.


Proof. Otherwise, we can get a well-ordered set $I$ with $|I| < \kappa$ and a family
$\langle E_i \mid i \in I\rangle$ such that $\mu(E_i) = 0$ for all $i \in I$ yet $\mu(E) > 0$ where $E := \bigcup_{i \in I} E_i$.
Define $f\colon E \to I$ by setting $f(x)$ := the least $i \in I$ such that $x \in E_i$. Define a
total probability measure $\nu$ on $I$ by

$$\nu(A) := \frac{\mu\big(f^{-1}[A]\big)}{\mu(E)} \qquad (A \subseteq I).$$

$\nu$ is continuous since for any $i \in I$ we have $f^{-1}[\{i\}] \subseteq E_i$, and so
$\nu(\{i\}) = \mu\big(f^{-1}[\{i\}]\big)/\mu(E) \le \mu(E_i)/\mu(E) = 0$. So $\nu$ is a continuous total probability
measure on the set $I$ with $|I| < \kappa$, contradicting the minimality of $\kappa$. □


Thus, as far as the existence of a solution to the measure problem is concerned,
without loss of generality we can and will assume that a continuous total probability
measure defined on a set of cardinality $\kappa$ is $\kappa$-complete.

Note that if $\mu$ is continuous and $\kappa$-complete, then $\mu(S) = 0$ for any set $S$ with
$|S| < \kappa$. As an immediate corollary we have:


Corollary 1195. Let $\mu$ be a continuous $\kappa$-complete total probability measure on a
set $X$ of cardinality $\kappa$. Then $\kappa$ is a regular cardinal.


Proof. If $X = \bigcup_{i \in I} X_i$ with $|I| < \kappa$ and $|X_i| < \kappa$ for all $i \in I$, continuity and
$\kappa$-completeness would give $\mu(X_i) = 0$ and so $\mu(X) = 0$, a contradiction. □


Theorem 1196 (Ulam). Let $\mu$ be a continuous $\kappa$-complete total probability measure
on a set $X$ of cardinality $\kappa$. Then $\kappa$ is a limit cardinal. Consequently, $\kappa$ must be
weakly inaccessible.


Proof. (Cf. Theorem 1157.) If possible, let $\kappa = \aleph_{\xi+1}$ be a successor cardinal. Well-order
$X$ with order type $\omega_{\xi+1}$, and fix a set $Y$ with $|Y| = \aleph_\xi$. For each $a \in X$, the
set $\operatorname{Pred}_X(a) = \{x \in X \mid x < a\}$ has cardinality $\le \aleph_\xi$, so we can fix an injection
$f_a\colon \operatorname{Pred}_X(a) \to Y$. For each $x \in X$ and $y \in Y$, put:

$$E^y_x := \{a \in X \mid x < a \text{ and } f_a(x) = y\} \qquad \text{(Ulam matrix)}.$$

Then for each $y \in Y$, the sets $\langle E^y_x \mid x \in X\rangle$ form a family of pairwise disjoint
subsets of $X$, so only countably many of these sets can have positive measure, and
hence (as $X$ has uncountable cofinality) we can fix $x_y \in X$ such that $\mu(E^y_x) = 0$
for all $x > x_y$. Since $\omega_{\xi+1}$ is regular and $|\{x_y \mid y \in Y\}| \le \aleph_\xi$, there is $p \in X$ with
$p > x_y$ for all $y \in Y$. Then $\mu(E^y_p) = 0$ for all $y \in Y$, so by $\kappa$-completeness of $\mu$,
$\mu\left(\bigcup_{y \in Y} E^y_p\right) = 0$. Hence the set $\{a \in X \mid a > p\}$ (being $\subseteq \bigcup_{y \in Y} E^y_p$) also has
$\mu$-measure 0. But by $\kappa$-completeness of $\mu$ again, the set $\{a \in X \mid a \le p\}$ also has
$\mu$-measure 0, which is a contradiction since $X$ cannot be the union of two sets of
$\mu$-measure 0. □


Corollary 1197 (Ulam). If there is a total measure extending Lebesgue measure,
then there is a weakly inaccessible cardinal $\le 2^{\aleph_0}$.


Thus if Lebesgue measure has a total extension, then $2^{\aleph_0} > \aleph_1$, $2^{\aleph_0} > \aleph_2$, ...,
$2^{\aleph_0} > \aleph_\omega$, etc., and so the Continuum Hypothesis is severely violated.

Conversely, if no cardinal $\le 2^{\aleph_0}$ is weakly inaccessible, e.g., if $2^{\aleph_0} < \aleph_\omega$, then
Lebesgue measure cannot have a total extension. Observe how this dramatically
improves the Banach-Kuratowski result, which derived the same conclusion assuming
the hypothesis $2^{\aleph_0} = \aleph_1$.


Finally, we consider the case of two-valued measures. The following result
implies that a two-valued continuous total measure must be defined on a set of
cardinality greater than or equal to some strongly inaccessible cardinal:


Corollary 1198 (Tarski-Ulam). If a $\kappa$-complete continuous total measure $\mu$ on a
set $X$ of cardinality $\kappa$ has an atom, then $\kappa$ is strongly inaccessible.


Proof. By Corollary 1195 $\kappa$ is regular, so it suffices to show that $\kappa$ is a strong limit:
If $\lambda < \kappa \le 2^\lambda$, then by Problem 1186 $X$ has a separating family of size at most
$\lambda < \kappa$, so by Proposition 1188, $\mu$ must be atomless, a contradiction. □


Combining this with Corollary 1192 culminates in Ulam’s major result: 


Corollary 1199 (Ulam). Let $\mu$ be a continuous $\kappa$-complete total probability measure
on some set of cardinality $\kappa$. Then we have:


(A) $\kappa \le 2^{\aleph_0}$ $\iff$ $\mu$ is atomless $\iff$ $\kappa$ is not strongly inaccessible.


(B) $\kappa > 2^{\aleph_0}$ $\iff$ $\mu$ has an atom $\iff$ $\kappa$ is strongly inaccessible.


Moreover, if any (and so all) of the conditions in (A) holds, then Lebesgue measure
has a total extension, and if any (and so all) of the conditions in (B) holds, then a
two-valued continuous total measure exists.


Note the dichotomy involved here: If $\kappa$ is as above, then either $\kappa \le 2^{\aleph_0}$ or $\kappa > 2^{\aleph_0}$
but not both; hence either all of the equivalent conditions in (A) hold, or else all of
the equivalent conditions in (B) hold, but not both.


Definition 1200 (Real Valued Measurable Cardinals). A cardinal $\kappa$ is called a
real valued measurable cardinal if there is a continuous $\kappa$-complete total probability
measure on some set of cardinality $\kappa$.


Thus a real valued measurable cardinal exists if and only if the measure problem has
a positive solution. Every real valued measurable cardinal is weakly inaccessible
(and so cannot be shown to exist using usual axioms).

Definition 1201 (Measurable Cardinals). A cardinal $\kappa$ is called a measurable
cardinal if there is a two-valued continuous $\kappa$-complete total measure on some set
of cardinality $\kappa$.


It follows that a cardinal is measurable if and only if it is real valued measurable and
strongly inaccessible. Measurable cardinals turn out to be “very large.” For example,
it can be shown that if $\kappa$ is measurable then $\kappa$ is not only weakly compact, but is
also preceded by $\kappa$-many weakly compact cardinals.


Problem 1202. Every measurable cardinal has the tree property.

Ulam’s results can now be stated in terms of measurable cardinals as follows.

Corollary 1203 (Ulam).


(A) A total extension of Lebesgue measure exists
$\iff$ there is a cardinal which is real valued measurable but not measurable
$\iff$ there is a real valued measurable cardinal $\le 2^{\aleph_0}$.

(B) If Lebesgue measure has no total extension, then a nontrivial continuous total
measure on some set exists $\iff$ there is a measurable cardinal.¹


Ulam’s definitive work above was the first example of a natural problem of 
classical mathematics whose solution is equivalent to a large cardinal hypothesis. 
Measurable cardinals gave birth to the field of large cardinals, and have considerable 
implications for several areas of mathematics. Theorem 1206 below is one such 
application. Postscript IV (Chapter 22) describes surprising connections between
measurable cardinals and infinite games (Theorem 1316). 


19.2 Projective Sets and Lusin’s Problem 


In Chap. 18, we studied Borel and analytic sets. These sets are quite effectively 
defined, and we saw that they enjoy many regularity properties, such as being 
Lebesgue measurable, having the Baire property, and the perfect set property. The 
subject area dealing with the study of such effectively defined sets is known as 
descriptive set theory, which originated in the work of Borel, Baire, Lebesgue, 
Lusin, Sierpinski, Suslin, Hausdorff, etc. 

It can be shown that analytic sets are precisely the continuous images of Borel 
sets. Using this idea, Lusin defined larger classes of sets called projective sets 
as follows. The analytic sets were called A sets and the coanalytic sets CA sets. 
Continuous images of coanalytic sets were called PCA sets, and complements of 
PCA sets CPCA sets, and so on. 


¹ Solovay proved that the relative consistency (with the usual axioms) of each of the alternatives of
Corollary 1203 implies the relative consistency of the other.

In modern notation, introduced by Addison, the projective hierarchy is defined
recursively as follows.


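Definition 1204 (The Projective Hierarchy). E is a $\mathbf{\Sigma}^1_1$ set if E is an analytic set,
and for each $n \in \mathbf{N}$, E is a $\mathbf{\Sigma}^1_{n+1}$ set if E is the continuous image of the complement
of some $\mathbf{\Sigma}^1_n$ set.

A $\mathbf{\Pi}^1_n$ set is one whose complement is a $\mathbf{\Sigma}^1_n$ set, and a $\mathbf{\Delta}^1_n$ set is one which is both
$\mathbf{\Sigma}^1_n$ and $\mathbf{\Pi}^1_n$.

A set is projective if it is $\mathbf{\Sigma}^1_n$ (or equivalently $\mathbf{\Pi}^1_n$ or $\mathbf{\Delta}^1_n$) for some n.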


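Thus a $\mathbf{\Sigma}^1_1$ set is simply an analytic set, a $\mathbf{\Pi}^1_1$ set is nothing but a coanalytic
set, and $\mathbf{\Delta}^1_1$ sets are those which are both analytic and coanalytic. By Suslin’s
theorem, therefore, the $\mathbf{\Delta}^1_1$ sets coincide with the class B of Borel sets. Also, we
have $\mathbf{\Sigma}^1_n \cup \mathbf{\Pi}^1_n \subseteq \mathbf{\Delta}^1_{n+1} = \mathbf{\Sigma}^1_{n+1} \cap \mathbf{\Pi}^1_{n+1}$, so we have a hierarchy of ever more
comprehensive collections of sets. It can be shown that this hierarchy increases
properly as n increases, and so we get the following picture:

[Figure not reproduced: diagram of the projective hierarchy.]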
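The proper inclusions just described can be summarized in a LaTeX display. This is only a sketch reconstructed from the statements above and from the standard form of the hierarchy, not a copy of the book’s own diagram; it assumes the `amssymb` package for `\subsetneq`, and the inclusions are stated for every $n \ge 1$.

```latex
\[
  \mathbf{\Delta}^1_n \subsetneq \mathbf{\Sigma}^1_n \subsetneq \mathbf{\Delta}^1_{n+1},
  \qquad
  \mathbf{\Delta}^1_n \subsetneq \mathbf{\Pi}^1_n \subsetneq \mathbf{\Delta}^1_{n+1},
  \qquad
  \mathbf{\Delta}^1_n = \mathbf{\Sigma}^1_n \cap \mathbf{\Pi}^1_n
  \qquad (n \ge 1).
\]
```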


Thus $\mathbf{\Sigma}^1_2$ sets are precisely the PCA sets and $\mathbf{\Pi}^1_2$ sets the CPCA sets, etc. The
projective sets are thus examples of effectively defined sets, and the total number
of projective sets is $2^{\aleph_0}$. (The reason they are called “projective” is that they can
alternatively be defined by taking projections, instead of continuous images, of sets in
higher dimensional Euclidean spaces.)

Classical descriptive set theory obtained the following major results, some of 
which we proved in Chap. 18: 


Theorem 1205 (Lusin–Sierpinski–Suslin).


1. Every $\mathbf{\Sigma}^1_1$ (analytic) set has the perfect set property, is Lebesgue measurable, and
has the Baire property.

2. Any two disjoint $\mathbf{\Sigma}^1_1$ sets can be separated by a $\mathbf{\Delta}^1_1$ set, and any two disjoint $\mathbf{\Pi}^1_2$
sets can be separated by a $\mathbf{\Delta}^1_2$ set.

3. Any $\mathbf{\Sigma}^1_2$ set (PCA set) can be expressed as a union of $\aleph_1$ Borel sets.


There were some other structural properties that were established about $\mathbf{\Pi}^1_1$ and
$\mathbf{\Sigma}^1_2$ sets (called reduction and uniformization, which we have not covered), but little else
was known about the higher projective classes.

Lusin and other mathematicians tried to obtain similar properties for the next 
levels of the projective hierarchy. They asked the following questions, which we 
will collectively refer to as Lusin’s Problem: 


Lusin’s Problem. What are the regularity and structural properties for the higher 
level projective classes? In particular:

1. Does every uncountable $\mathbf{\Pi}^1_1$ (coanalytic) set contain a perfect subset?
2. Is every $\mathbf{\Sigma}^1_2$ set Lebesgue measurable?
3. Does every $\mathbf{\Sigma}^1_2$ set have the Baire property?


Despite major efforts by all the leading descriptive set theorists of that period, none 
of the above questions could be answered, and essentially nothing about the higher 
level projective classes could be said. 

This caused Lusin to remark that “one does not know and one will never know” 
the answer to the above questions about projective sets, even though projective sets 
are effectively defined and form an infinitesimally small part of P(R). As we will 
see, Lusin’s remark turned out to be prophetic. 


Independence Results for Lusin’s Problem 


The first results that partially explained why Lusin’s Problems could not be solved
came from Gödel, who showed [22] that one cannot prove that PCA ($\mathbf{\Sigma}^1_2$) sets are
Lebesgue measurable or that coanalytic sets have the perfect set property using the
usual axioms of set theory (provided these axioms are consistent). Gödel’s work
will be briefly discussed in Postscript IV. Much later, Solovay and Martin showed,
by extending Cohen’s technique of forcing, that one cannot disprove that PCA
sets are Lebesgue measurable (or that they have the Baire property). What they showed is
that MA+not-CH is relatively consistent with the usual axioms of set theory. Since
MA+not-CH combined with Sierpinski’s result Theorem 1205(3) implies that every
$\mathbf{\Sigma}^1_2$ set is Lebesgue measurable and has the Baire property, the relative consistency of
the latter also follows. Moreover, Solovay also showed, assuming the consistency
of the existence of an inaccessible cardinal, that one can consistently assume that every
projective set is Lebesgue measurable, has the Baire property, and has the perfect
set property (see Theorem 1308).

The Gödel–Martin–Solovay independence results showed that all of Lusin’s
Problems are essentially unsolvable using the standard axioms of mathematics, thus 
fully confirming Lusin’s prediction. 


19.3 Measurable Cardinals and PCA ($\mathbf{\Sigma}^1_2$) Sets


Surprisingly, Solovay also showed that the existence of measurable cardinals does 
resolve Lusin’s Problem, and in a desirable positive way: 


Theorem 1206 (Solovay). If there is a measurable cardinal, then all $\mathbf{\Sigma}^1_2$ sets have
the perfect set property, are Lebesgue measurable, and have the Baire property.


After Ulam’s settlement of the measure problem, this was another remarkable
example of a large cardinal axiom resolving an unsolvable problem of classical
mathematics, and raised the hopes of discovering large cardinal axioms which may
be added as new axioms of mathematics. At the same time, Silver showed that
Lusin’s Problems for projective classes of levels higher than $\mathbf{\Sigma}^1_2$ cannot be resolved
using measurable cardinals alone.

Solovay, however, conjectured that stronger large cardinal axioms might resolve 
the problems for the projective sets, and the search for such large cardinals—or 
other possible axioms—became one of the greatest problems of modern descriptive 
set theory. Outstanding work by many people culminated in a truly remarkable 
resolution of the problem. But this is a topic for Postscript IV, where we will discuss 
connections between large cardinals and determinacy of infinite games. 


Goldring [25] is a very accessible survey of the topics of this postscript. 


Part IV 
Paradoxes and Axioms 


Introduction to Part IV 


In Parts I-III of this book we developed cardinals, order, ordinals, and real point set 
theory, and also indicated how these relate to some classical areas of mathematics. 
We carried out that development in an informal and naive way as we would do for 
any standard area of mathematics such as geometry, exploring structural details and 
obtaining views and intuitions about the subject matter. In this sense, Parts I-III of 
the book were purely mathematical. 

The naive theory of sets, however, can lead to contradictions unless suitable 
restrictions are placed on the simple principles forming the basis of the theory. This 
requires a careful scrutiny of the logical foundations of set theory. To stay focused 
on the mathematical aspects of our topics, we had so far avoided getting into this 
metamathematical problem. 

In this part, we will give an overview of such logical and foundational matters, 
starting with some famous contradictions of naive set theory and two early responses 
to them (Chapter 20). Our coverage will necessarily be very elementary and 
introductory, and we will refer the reader to more comprehensive works for further 
details. 

In Chap. 21, we briefly present Zermelo—Fraenkel set theory (ZF) and the 
von Neumann ordinals, providing only bare outlines for the formal development 
of some of the basic notions of set theory such as order and cardinals. However, the 
reader who has mastered the theories of numbers, cardinals, ordinals, and the real 
continuum developed in Parts I-III, will find the re-development of all these theories 
within the formal framework of ZF a relatively routine matter, and we encourage the 
reader to take up this project of replicating the results of Parts I-III formally in ZF. 

Finally, the postscript to this part (Chapter 22) provides glimpses into some 
landmark results of set theory of the past 75 years.


Chapter 20
Paradoxes and Resolutions


Abstract Unless carefully restricted, the informal naive set theory that we have so 
far been using can produce certain contradictions, known as set theoretic paradoxes. 
These contradictions generally result from consideration of certain very large sets 
whose existence can be derived from the unrestricted comprehension principle. 
This chapter discusses three such classical paradoxes due to Burali-Forti, Cantor, 
and Russell, which showed the untenability of naive set theory and the need for 
more careful formalizations. The two earliest responses to the paradoxes, namely 
Russell’s theory of types and Zermelo’s axiomatization of set theory, are discussed. 


20.1 Some Set Theoretic Paradoxes 


The Burali-Forti Paradox 


One of the oldest paradoxes of set theory is the Burali-Forti paradox. It shows that 
a contradiction can be derived from the assumption that there is a set containing all 
ordinals. 


Proposition 1207 (Burali-Forti’s Paradox). The assumption that there is a set of 
all ordinals leads to a contradiction. 


Proof. If there were a set $\Omega$ consisting of all ordinals, it would be an initial set of
ordinals, so we would have $\Omega = \{\beta \mid \beta < \alpha\}$ where $\alpha$ is the order type of $\Omega$ (recall
that any set of ordinals is well-ordered). Since $\Omega$ contains all ordinals, we would have
$\alpha \in \Omega = \{\beta \mid \beta < \alpha\}$, whence $\alpha < \alpha$, a contradiction. □


When viewed as a proof by contradiction, this result can be put in the following
form: The set of all ordinals does not exist.


The above proof is so simple and clear that it already forces us to doubt
unrestricted comprehension. By unrestricted comprehension there is a set of all
ordinals, and by Burali-Forti’s theorem, there is no set of all ordinals. Since the proof
of Burali-Forti’s theorem is highly rigorous while the unrestricted comprehension
axiom uses the vague notion of “arbitrary property applicable to any object
whatsoever,” this causes us to question the axiom.


The Cantor—Russell Paradoxes 


Cantor’s paradox results from the assumption that there is a set containing all sets:
By Cantor’s theorem there is no largest cardinal since $2^\kappa > \kappa$ for any cardinal $\kappa$, but
on the other hand the cardinality of the set of all sets must be the largest cardinal.


Proposition 1208 (Cantor’s Paradox). The assumption that there is a set containing
all sets leads to a contradiction.


Proof. If there were a set V containing all sets, we would have $P(V) \subseteq V$, hence
$|P(V)| \le |V|$, contradicting Cantor’s theorem that $|P(V)| > |V|$. □


When viewed as a proof by contradiction, Cantor’s paradox can be put in the 
following form: There is no set containing all sets, or, as famously paraphrased 
by Halmos: “Nothing contains everything.” 


The following problem gives a result closely related to Cantor’s paradox. 


Problem 1209. The assumption that there is a set containing all cardinal numbers 
leads to a contradiction. 


[Hint: Use Hartogs’ theorem and the fact that for any ordinal $\alpha$ there is $\beta > \alpha$ with
$\omega_\beta = \beta$.]

Cantor’s paradox is based on Cantor’s theorem, which we present again in the
following “diagonalization” form: For any function F with domain X, there is a
member of P(X) which is not in ran(F).


Theorem 1210 (Cantor Diagonalization). If X is any set and F is any function 
with domain X, then there is a subset of X not in the range of F. 


Proof. Take the Cantor diagonal set for the function F, namely:

$D := \{x \in X \mid x \notin F(x)\}$.

If we had D = F(a) for some $a \in X$, then we would get $a \in D \Leftrightarrow a \notin F(a) \Leftrightarrow a \notin D$,
which is a contradiction. □
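The diagonal construction can be tried out on a small finite set. The following Python sketch is an illustration only: it restricts attention to functions F from X into P(X), uses frozensets of integers as stand-ins for sets, and the names `diagonal_set`, `powerset`, and `diagonal_always_escapes` are hypothetical helpers introduced here.

```python
from itertools import chain, combinations, product

def diagonal_set(X, F):
    """Return the Cantor diagonal set D = {x in X : x not in F(x)}."""
    return frozenset(x for x in X if x not in F(x))

def powerset(X):
    """Return all subsets of X as a list of frozensets."""
    items = list(X)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

# One concrete (made-up) function F from X into P(X), stored as a dict.
X = frozenset({0, 1, 2})
F = {0: frozenset({0, 1}), 1: frozenset(), 2: frozenset({0, 2})}
D = diagonal_set(X, F.__getitem__)   # 0 is in F(0), 1 is not in F(1), 2 is in F(2)

def diagonal_always_escapes(X):
    """Check every function F: X -> P(X) and confirm D is never a value of F."""
    items = list(X)
    for values in product(powerset(X), repeat=len(items)):
        G = dict(zip(items, values))
        if diagonal_set(X, G.__getitem__) in G.values():
            return False
    return True
```

For the three-element X above there are 8^3 = 512 such functions, so the exhaustive check in `diagonal_always_escapes` is cheap; it is of course no substitute for the proof, which covers arbitrary X and F.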


The famous Russell’s paradox is closely related to Cantor’s theorem above. In fact, 
as Russell comments in [69, p. 58], and as we will see now, it is obtained as a special 
case when the function F is taken to be the identity function on the set of all sets.

Given any set X, the Russell set $R_X$ for X is defined to be the Cantor diagonal
set for the identity function on X:


$R_X := \{x \in X \mid x \notin x\}$.


Hence, by Cantor’s theorem in the form stated above, for any set X, its Russell set
$R_X$ cannot be in the range of the identity function on X, i.e., $R_X$ is not a member
of X.¹


Proposition 1211 (Property of the Russell Set). For any set X, its Russell set
$R_X := \{x \in X \mid x \notin x\}$ is not a member of X.
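As a small sanity check of this proposition, one can compute Russell sets of hereditarily finite sets in Python. This is a sketch under a strong assumption: Python frozensets can only be built from objects that already exist, so no frozenset is a member of itself, and the Russell set of X therefore comes out equal to X. The function name `russell_set` and the sample sets are hypothetical.

```python
def russell_set(X):
    """Return R_X = {x in X : x not in x} for a frozenset X of frozensets."""
    return frozenset(x for x in X if x not in x)

empty = frozenset()                  # the empty set
one = frozenset({empty})             # {empty}
two = frozenset({empty, one})        # {empty, {empty}}
X = frozenset({empty, one, two})

R = russell_set(X)
russell_set_is_member = R in X       # the proposition says this must be False
```

Here every member of X is a frozenset that does not contain itself, so `R` equals `X`, and `R in X` is false because X is not one of its own three members.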


Call a set x normal if $x \notin x$ (most sets encountered in practice are normal).
Thus the Russell set $R_X$ of X consists of all normal members of X. Russell’s
paradox is obtained by taking the set R of all normal sets; in other words, R is
the Russell set of the set of all sets, i.e., $R = R_V$, where V is the set of all sets.


Proposition 1212 (Russell’s Paradox). The assumption that there is a set R
consisting of all normal sets leads to a contradiction.²


Proof. We have $R \in R \Leftrightarrow R$ is normal $\Leftrightarrow R \notin R$. □


Thus Russell’s Paradox, obtained by applying Cantor’s theorem to the identity
function defined on the set of all sets, results in a contradiction of a strikingly 
simple form. A popular version of Russell’s paradox talks of a certain barber in 
a certain town who shaves those and only those who do not shave themselves. We 
then get a contradiction since the assumption that the barber shaves himself leads to 
its negation and vice versa. Viewed as a proof by contradiction, this means that such 
a barber cannot exist. 


Impact on the Frege—Russell Logicist Program 


The goal of the original logicist program—pioneered by Frege during 1879-1903 
and championed by Russell—was to develop mathematics purely from logic using 
the central notion of the extension of a property (or of a concept). Roughly speaking, 
the extension of a property P is what we would now call the set of all objects having 
the property P, embodied in the axiom of extensionality: If two properties P and 
Q satisfy the condition that x has property P if and only if x has property Q for
all objects x, then P and Q have the same extension (i.e., they determine the same 
set). In addition to the axiom of extensionality, the logicist system used the axiom of 


¹ This is equivalent to saying $X \cup \{R_X\}$ is a proper superset of X. Under the Axiom of Foundation,
$R_X$ equals X itself, and so the Axiom of Foundation implies that the set $X \cup \{X\}$, called the
successor set of X, is a strictly larger set.


² The paradox was also discovered independently by Zermelo (unpublished).

unlimited comprehension, which says that every property P has an extension (i.e.,
the set of all objects satisfying P exists). The logical deductive system based on 
these two axioms—which we will call the naive Frege—Russell logicist system— 
was used to develop significant bodies of mathematics (such as arithmetic and the 
theory of cardinals), until Russell discovered his paradox in 1901 showing that the 
naive Frege—Russell logicist system was inconsistent and therefore must be modified 
in some way or other. Remarkably, Russell reported his paradox in a famous letter 
to Frege in 1902, precisely when the second and final volume of Frege’s completed 
development of the system was in press for publication. 

Russell’s paradox is perhaps the simplest one among all set-theoretic paradoxes.
It can be quickly derived from unrestricted comprehension using the
set-membership relation alone, without any need for using more advanced defined
notions such as ordinals, cardinals, or the power set. In addition to the abandonment
of the original naive Frege–Russell logicist system, it led to the permanent prohibition
of the use of the unrestricted comprehension principle, and thus to revisions of
the methods for new set formation.


Resolution of the Paradoxes 


The set theoretic paradoxes almost invariably resulted from consideration of very 
large collections such as the collection of all sets or the collection of all ordinal 
numbers. The axiom of extensionality was uncontroversial, but as pointed out above, 
unlimited comprehension was highly suspect, and it soon became clear that this 
axiom had to be modified. 

It was also clear that the informal nature of the naive set theory of Cantor and 
Frege carried risk of generating contradictions, and if contradictions were to be 
avoided then a more careful formalization of the principles of set theory was needed. 
Several such formal approaches developed over the years, and we now discuss the 
two earliest responses to the paradoxes, the first by Russell himself, and the second 
by Zermelo, the other mathematician who had independently encountered Russell’s 
paradox. 


20.2 Russell’s Theory of Types 


The first proposed way to address the paradoxes was introduced by Russell himself 
in his 1903 book The Principles of Mathematics [66]. Russell called his solution the 
Theory of Types, and developed and extended it fully in 1908 [67] and 1910 [81]. 
The theory of types was later modified and considerably simplified by Ramsey, 
Carnap, Tarski, Gödel, Church, and Quine. We will give a very rough description of
type theory in its simplest form.

The theory of types classifies objects into a hierarchy according to their logical
type. Objects which are not sets, called individuals, or atoms, have type 0. Sets
consisting of individuals alone are of type 1. Sets consisting of sets of type 1, i.e.,
sets of sets of individuals, are of type 2, and so on. In general, the set of objects of
type $n + 1$ is simply the “power set” of the set of objects of type $n$. The main principle
of type theory is that the expression “$x \in y$” cannot be meaningful unless we have
type of y = 1 + type of x. Thus the expression “$x \in x$,” required in formulating
Russell’s paradox, is simply meaningless. Moreover, a set of type $n + 1$ can only
contain objects of type $n$, and so objects of distinct types cannot be mixed. It follows
that the collection of all sets is also meaningless. This results in a resolution of
Russell’s paradox and other paradoxes involving the set of all sets such as Cantor’s
paradox. One immediate oddity of this theory, however, is the “duplication of the
empty set” across various types: There is an empty set of type 1, an empty set of
type 2, etc. This may appear to be a violation of the axiom of extensionality, but
note that in type theory there is a separate axiom of extensionality for each type!
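The typing discipline just described can be mimicked by a toy Python model. This is only a rough sketch of simple type theory, not Russell’s own system: the class `Obj` and the helpers `individual`, `typed_set`, and `is_member` are hypothetical names, and a membership test that violates the typing rule is reported by raising `TypeError`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Obj:
    level: int                       # 0 for individuals, n + 1 for sets of type-n objects
    members: frozenset = frozenset()
    name: str = ""                   # label, used only to tell individuals apart

def individual(name):
    """Create an individual (an object of type 0)."""
    return Obj(0, frozenset(), name)

def typed_set(level, members=()):
    """Create a set of the given type; all members must have type level - 1."""
    members = frozenset(members)
    if level < 1:
        raise TypeError("sets have type at least 1")
    if any(m.level != level - 1 for m in members):
        raise TypeError("a set of type n + 1 may only contain objects of type n")
    return Obj(level, members)

def is_member(x, y):
    """Evaluate 'x in y', which is meaningful only if type(y) = 1 + type(x)."""
    if y.level != x.level + 1:
        raise TypeError("'x in y' is meaningless unless type(y) = 1 + type(x)")
    return x in y.members

a, b = individual("a"), individual("b")
s = typed_set(1, {a, b})             # a set of individuals, type 1
empty1, empty2 = typed_set(1), typed_set(2)   # one empty set per type
```

In this model `is_member(a, s)` is true, `is_member(s, s)` raises `TypeError` (the analogue of “$x \in x$” being meaningless), and `empty1 != empty2` reflects the duplication of the empty set across types.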


Principia Mathematica 


Russell and Whitehead’s Principia Mathematica (or PM), published in three 
volumes [81], was a revival of the original naive Frege—Russell logicist program 
of developing mathematics deductively from a few “logical” axioms. Using a more 
careful revision (based now on type theory) of the axioms of the naive system, it 
made a heroic attempt to demonstrate that mathematics is an extension of logic. 
More specifically, it deductively developed portions of mathematics starting from 
axioms which they claimed to be principles of logic itself. 

Principia Mathematica served as a reference point for the fact that mathematics 
can be deductively developed from a few axioms, logical or not, and had a 
tremendous impact on the development of set theory and logic in the subsequent 
years. However, while it succeeded in demonstrating that portions of mathematics 
can be developed from a few axioms, criticisms were raised that it failed in its 
logicist program. Some of the axioms used did not appear to be logical. The axiom 
of infinity, which states that there is an infinite set, appears more quantitative 
than logical. Most significantly, PM needed a special axiom called the axiom of 
reducibility, which did not appear to have any logical or intuitive justification at all, 
and was rejected even by other supporters of the logicist program. 

Because of the complexities involved and the resulting need for the axiom of 
reducibility which had little justification, the form of type theory in PM, known as 
ramified type theory, never gained acceptability. Later work by Ramsey, Carnap,
Tarski, Gödel, Church, Quine,³ etc., resulted in a more satisfactory simple theory


³ Type theory also lurks behind Quine’s New Foundations (discussed in Sect. 21.9), where
type-distinction plays a basic role in forming sets via comprehension.

of types, but type theory itself is not popular for the deductive development of
mathematics. Far more successful has been Zermelo’s axiomatic formulation of 
1908, where all kinds of sets, including those having distinct hierarchical ranks, 
can be freely mixed. 

Although not popular in axiomatic developments of mathematics, type theory, 
nevertheless, has had far reaching implications for other areas such as logic, 
philosophy, and computability. For example, with its close relation with Church’s 
lambda-calculus, type theory today finds significant applications in modern
computer science.


20.3 Zermelo’s Axiomatization 


In 1908, Zermelo [85] introduced a completely different set of axioms in which the 
objects of the theory form a single domain consisting of all sets (and possibly also 
individuals, or atoms, which are not sets). Unlike type theory, there is no longer 
an a priori classification of objects into individuals, sets, sets of sets, etc. Instead, 
all these objects (having “mixed types”) are put together in the single domain and
are regarded to be of the same sort to start with. Along with this domain of all
sets the only other primitive notion used in Zermelo’s system is the relation of set
membership, where x has this relation to y if and only if $x \in y$.

Zermelo’s system was the first formulation of axiomatic set theory in the modern 
sense of the term. With its single domain of objects containing sets of different types 
freely mixed together, it has a considerably simpler setup than Russell’s type theory. 

A most important feature of Zermelo’s system is the removal of the unrestricted 
comprehension principle. Instead of forming sets defined by arbitrary properties, 
Zermelo’s system limits comprehension by only allowing formation of subsets of 
a set which is already known to exist. This restricted axiom of comprehension 
says that given a set A and a property P, we can form the subset $\{x \mid x \in A \text{ and } P(x)\}$
of A. This subset is said to be separated out of A via the property P.
Thus Zermelo’s axiom of restricted comprehension is also known as the axiom of 
separation (Aussonderungsaxiom) or as the axiom of subsets. This limited principle 
of comprehension prevents the entire domain of all sets from being a set itself, 
avoiding contradictions involving “the set of all sets.” 

Of course, with such a limited form of comprehension, additional axioms are 
needed to form new sets. One such axiom allows us to form the power set of any
given set. Even from assuming only the existence of the empty set, one can iterate 
the formation of power sets to get more and more sets, giving us a rich supply of 
finite sets. Thus, another important feature of Zermelo’s system is that it forms new 
sets by building them from ground up in stages, rather than by sweeping uses of 
comprehension. There are other axioms for forming new sets, such as the axiom of 
union, which states that we can form the union of the members of any given set 
(of sets). Since we will discuss these and other axioms in more detail in Chap. 21,
we finish this short discussion of Zermelo’s axiomatization with some comments on
subsequent modifications of Zermelo’s system.

Zermelo’s system was later improved by Skolem (who made the vague notion of
“definite property” precise by replacing it with the formal syntactic notion of
“first-order formula”), and was made more comprehensive by Fraenkel (who introduced
the axiom of replacement necessary for the development of the theory of the
transfinite). The enlarged system which is now standard is sometimes called the
Zermelo–Skolem–Fraenkel system, but we will follow current usage and call it the
Zermelo–Fraenkel system, abbreviated as ZF set theory. This notation assumes that
the axiom of choice is not included in ZF, and ZF augmented with the axiom of
choice is called ZFC, or Zermelo–Fraenkel set theory with Choice.

ZFC has turned out to be the most popular formulation of axiomatic set theory, 
and has become the standard axiomatization of set theory today. It is a very powerful 
(perhaps too powerful) system certainly capable of providing a framework for all of 
mathematics: All mathematical statements can be expressed in its language and all 
theorems of classical mathematics can be proved in ZFC. 

Van Heijenoort [78, p. 199] illustrates the striking difference between the two 
early responses to the paradoxes by contrasting the pragmatic foundation for mathematics
provided by Zermelo’s axiomatization with the wide logico-philosophical
ramifications of Russell’s type theory. 


Chapter 21
Zermelo–Fraenkel System and von Neumann Ordinals


Abstract In this chapter we present the Zermelo–Fraenkel axiom system, which is
an enhancement of Zermelo’s 1908 system by Fraenkel, Skolem and others, as well
as the von Neumann ordinals, which assign, in a remarkably simple, constructive,
and canonical fashion, a unique representative well-order to each ordinal number.


21.1 The Formal Language of ZF 


For a precise formulation of the ZF axioms we first need to formalize its language, 
which we call the language of ZF set theory. 

Expressions in the language of ZF set theory will be certain strings of a specific 
group of symbols. 

First we will need symbols for variables, for which we will use letters such as
a, b, c, ..., u, v, w, x, y, z, etc. Next we need logical connectives, namely $\neg$ (not), $\lor$
(or), $\land$ (and), $\to$ (implies), and $\leftrightarrow$ (if and only if). The other type of logical symbol
we will need are the quantifiers, namely $\forall$ (for all) and $\exists$ (there exists). For grouping
expressions, we will also use the two special symbols “(” and “)” (parentheses).

The above types of symbols are called logical symbols, and are widely used in
mathematical contexts.

The most important symbol, and the only nonlogical primitive symbol in the
language of ZF set theory, will be the symbol “$\in$,” representing set membership.

Using “$\in$,” we can form atomic formulas such as $x \in y$, $u \in u$, etc., where
the letters are used for variables ranging over sets. By combining atomic formulas
using connectives and using quantifiers over them, we can form general formulas of
arbitrary complexity such as:


$u \notin u$, $\forall y(y \notin x)$, $\forall z(z \in x \to z \in y)$, etc.

Formally, a ZF formula is any expression (i.e., a string consisting of the above types
of symbols) that can be built starting from the atomic formulas using a finite number
of applications of the following formation rules:


1. Any expression of the form “$u \in v$” is a ZF formula, where u and v are variable
letters (any atomic formula is a ZF formula).

2. If $\alpha$, $\beta$ are ZF formulas and v is a variable letter, then each of the following is a
ZF formula:


$\neg(\alpha)$, $(\alpha) \land (\beta)$, $(\alpha) \lor (\beta)$, $(\alpha) \to (\beta)$, $(\alpha) \leftrightarrow (\beta)$, $\forall v(\alpha)$, $\exists v(\alpha)$.


The occurrence of a variable within a formula becomes bound if it is quantified 
by some quantifier (more precisely if it falls under the scope of a quantifier). The 
occurrence of a variable which is not quantified by some quantifier of the formula 
is called free. For example, in the formula below (which says “x is a subset of y”): 


$\forall z(z \in x \to z \in y)$,


all occurrences of the variable z are bound, while the variables x and y are free. In 
the formula 


$\exists y(y \in x) \land \forall x(x \notin x)$,


the first occurrence of x is free, but all other occurrences are bound. Thus a variable 
may be both free and bound in a formula. 
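The formation rules and the notion of a free variable lend themselves to a short recursive sketch in Python. The representation below is an assumption made for illustration: the class names `In`, `Not`, `BinOp`, `Quant` and the function `free_vars` are hypothetical, and connectives and quantifiers are identified by plain strings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class In:                # atomic formula: left is a member of right
    left: str
    right: str

@dataclass(frozen=True)
class Not:               # negation of a formula
    body: object

@dataclass(frozen=True)
class BinOp:             # op is one of "and", "or", "implies", "iff"
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Quant:             # kind is "forall" or "exists"; var is the quantified variable
    kind: str
    var: str
    body: object

def free_vars(f):
    """Return the set of variables having at least one free occurrence in f."""
    if isinstance(f, In):
        return {f.left, f.right}
    if isinstance(f, Not):
        return free_vars(f.body)
    if isinstance(f, BinOp):
        return free_vars(f.left) | free_vars(f.right)
    if isinstance(f, Quant):
        return free_vars(f.body) - {f.var}
    raise TypeError(f"not a ZF formula: {f!r}")

# "x is a subset of y": for all z, z in x implies z in y.
subset = Quant("forall", "z", BinOp("implies", In("z", "x"), In("z", "y")))

# (exists y, y in x) and (for all x, x not in x): x occurs both free and bound.
mixed = BinOp("and", Quant("exists", "y", In("y", "x")),
              Quant("forall", "x", Not(In("x", "x"))))
```

With these definitions `free_vars(subset)` is `{"x", "y"}` and `free_vars(mixed)` is `{"x"}`, matching the two examples discussed above. Each class corresponds to one formation rule, so a value built from them is a formula by construction.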

This language is used to formulate ZF set theory as a first order theory, in which
all variables range over (and so all quantifications are limited to) a single domain
of objects consisting of pure sets. Thus in ZF, ‘$\forall x(\ldots)$’ is interpreted as “for all
sets x, ...,” ‘$\exists x(\ldots)$’ as “for some set x, ...,” and ‘$x \in y$’ as “set x is a member of
set y.”

With this overview of the language of ZF set theory, we now turn to the ZF 
axioms. 


21.2 The First Six ZF Axioms 


ZF 1 (Axiom of Extensionality). $\forall x \forall y(\forall z(z \in x \leftrightarrow z \in y) \to x = y)$.


ZF 2 (Axiom of Empty Set). There exists a set which has no members:
$\exists x \forall y(y \notin x)$.


By extensionality, there is a unique set which has no members, and this set (the
empty set) is denoted as usual by $\emptyset$.

At this point, we can define in ZF the subset relation as:

$x \subseteq y \leftrightarrow \forall z(z \in x \to z \in y)$,

and basic properties of the subset relation can be established in ZF: For all a, b, c
we have $\emptyset \subseteq a$, $a \subseteq a$, and ($a \subseteq b$ and $b \subseteq c$) $\Rightarrow a \subseteq c$.

One crucial axiom of ZF is the axiom of separation or the axiom of restricted 
comprehension (Zermelo’s Aussonderungsaxiom). It is sometimes called the axiom 
of subsets since the axiom states that, given a set a and a “definite property” P, we 
can form the subset of a defined by 


$\{x \mid x \in a \text{ and } P(x)\}$ (Restricted Comprehension).


This is thus an axiom scheme, that is, an infinite list of axioms, one for each definite
property P.

Contrast this with the old naive unrestricted comprehension which allows 
forming a set 


{x| P(x)} (Unrestricted Comprehension) 


for any property P, without requiring that the elements of the set being formed be 
restricted within some set a already known to exist. This allowed the formation of 
sets like “the set of all sets” by taking P(x) to be ‘$x = x$’, or “Russell’s set of
all sets not containing themselves” by taking P(x) to be ‘$x \notin x$’, which lead to
paradoxes. Separation (restricted comprehension) is designed to avoid such large
problematic sets; see Theorem 1213 below.

The term “definite property” was used in Zermelo’s original paper of 1908, but 
it was not precisely defined. One of Skolem’s many important contributions to 
axiomatic set theory was to make the vague notion of “property” precise: In each 
instance of the separation scheme, the expression “P(x)” is taken to be any formal
ZF formula of set theory in which x occurs free. 

We will say that P is a ZF property if P is a ZF formula with a specified free 
variable, and we use the notation P = P(x) to indicate that the specified free
variable is x. If P is a ZF property, then we will also use the notation P(a) to denote 
the formula which results from P when the specified free variable is replaced by a. 

More generally, other variables may occur free as well in the defining property
P, and these variables are then called parameters, as indicated in the notation
$P = P(x, y_1, y_2, \ldots, y_n)$. For example,


$\{x \mid x \in a \text{ and } P(x, y_1, y_2, \ldots, y_n)\}$


is the subset of a defined by the property $P = P(x, y_1, y_2, \ldots, y_n)$ with the
parameters $y_1, y_2, \ldots, y_n$. Thus we have the following precise formulation of the
separation axiom scheme due to Skolem.

ZF 3 (Axiom Scheme of Separation). If $\varphi(x, y_1, y_2, \ldots, y_n)$ is a ZF formula in
which the free variables are among $x, y_1, y_2, \ldots, y_n$, then the following is an
instance of the separation scheme:


$\forall y_1 \forall y_2 \cdots \forall y_n \forall a \exists b \forall x(x \in b \leftrightarrow x \in a \land \varphi(x, y_1, y_2, \ldots, y_n))$.


With just these three axioms, we can now turn Russell’s paradox into a theorem of 
ZF which says that there is no set containing all sets: 


Theorem 1213 (ZF). $\neg \exists x \forall y (y \in x)$.


The proof is left as an exercise. 

The only set whose existence can be proved with these three axioms is the empty 
set, and so we clearly need more axioms for building new sets. Two more axioms 
are: 


ZF 4 (Axiom of Power Set). $\forall x \exists y \forall z (z \in y \leftrightarrow z \subseteq x)$: For any set x, there is a
set y consisting precisely of the subsets of x.


ZF 5 (Axiom of Union). $\forall x \exists y \forall z (z \in y \leftrightarrow \exists w (w \in x \wedge z \in w))$: For any set x,
there is a set y consisting precisely of the members of members of x.


By extensionality, the set y in the power set axiom is uniquely determined by x, and
so we can define the usual notion of power set. We let, as usual, $\mathcal{P}(x)$ denote the
power set of x. Similarly the set y in the union axiom is uniquely determined by x,
and will be denoted using the usual notation $\bigcup x$.

We will freely use the notations $\mathcal{P}(x)$ and $\bigcup x$, but note that these defined notions
are not part of the ZF language and can be formally eliminated. Thus ‘$y = \mathcal{P}(x)$’ is
expressed in the language of ZF set theory as ‘$\forall z (z \in y \leftrightarrow \forall u (u \in z \rightarrow u \in x))$’,
and ‘$y = \bigcup x$’ as ‘$\forall z (z \in y \leftrightarrow \exists u (z \in u \wedge u \in x))$’.

Another defined notion is that of the singleton {x} of x, which is defined formally
as $y = \{x\} \leftrightarrow \forall z (z \in y \leftrightarrow z = x)$.


Problem 1214. Formulate and prove in ZF the assertion that for any a, the 
singleton set {a} exists and is unique. 


[Hint: Note that $\{a\} \subseteq \mathcal{P}(a)$, and then use separation.]

Starting from the empty set, we can iterate the power set operation to get larger
and larger sets, as in


\[ \varnothing \subseteq \mathcal{P}(\varnothing) \subseteq \mathcal{P}(\mathcal{P}(\varnothing)) \subseteq \cdots \]

Let us put:

\[ V_0 = \varnothing, \quad \text{and} \quad V_{n+1} = \mathcal{P}(V_n). \]


Note that if $V_n$ has m elements then $V_{n+1}$ has $2^m$ elements.


Problem 1215. List all elements of $V_4$.


From the axioms introduced so far, one can derive that each set $V_n$ (and in fact every
member of $V_n$) exists, but the infinite collection $\bigcup_n V_n$ itself cannot be shown to exist
without using the axioms of Replacement and Infinity to be introduced later.
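The finite stages can be computed directly. The following is a minimal Python sketch, assuming that hereditarily finite sets are modelled by nested `frozenset` values; the names `power_set` and `finite_stages` are illustrative and do not come from the text.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of the frozenset s."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def finite_stages(n):
    """Return [V_0, ..., V_n] with V_0 empty and V_{k+1} = P(V_k)."""
    stages = [frozenset()]
    for _ in range(n):
        stages.append(power_set(stages[-1]))
    return stages

stages = finite_stages(4)
# Number of elements of V_0, ..., V_4; each is 2 to the power of the previous one.
sizes = [len(v) for v in stages]
```

Only the first few stages are practical to enumerate this way, since the sizes grow as iterated powers of 2.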

Another useful axiom is the Axiom of Unordered Pairs, also called the Axiom of
Pairing, which says that for any x, y we can form the unordered pair {x, y}. This
amounts to $\forall x \forall y \exists z (z = \{x, y\})$, which in turn can be expressed as a ZF formula
by replacing $z = \{x, y\}$ with ‘$\forall w (w \in z \leftrightarrow w = x \vee w = y)$’:


ZF 6 (Axiom of Unordered Pairs). $\forall x \forall y \exists z \forall w (w \in z \leftrightarrow w = x \vee w = y)$.


Problem 1216. Using the ZF axioms introduced so far, show that for any a, b, c
the unordered triple {a, b, c} exists. That is, it is a theorem of ZF that


\[ \forall a \forall b \forall c \, \exists z \, \forall w \, (w \in z \leftrightarrow w = a \vee w = b \vee w = c). \]


We now have enough axioms to develop most of the important notions of set theory, 
although infinite sets cannot yet be shown to exist. Using Pairing and Union, we can 
define $a \cup b$ simply as $\bigcup\{a, b\}$, while $a \cap b$ and $a \setminus b$ can be defined more simply
using Separation (once again, these defined notions can be formally eliminated).


Definition 1217. We define in ZF:

\[ a \cup b := \bigcup\{a, b\}, \qquad a \cap b := \{x \in a \mid x \in b\}, \qquad a \setminus b := \{x \in a \mid x \notin b\}. \]
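As an illustration of Definition 1217, here is a small Python sketch, assuming sets are modelled by `frozenset` values. The function names (`pair`, `big_union`, and so on) are hypothetical helpers chosen for this example: the binary union goes through Pairing and Union, while intersection and difference are cut out of `a` in the manner of Separation.

```python
def pair(a, b):
    """Unordered pair {a, b} (Axiom of Pairing)."""
    return frozenset({a, b})

def big_union(x):
    """Union of x: the members of members of x (Axiom of Union)."""
    return frozenset(z for w in x for z in w)

def union(a, b):
    """a ∪ b defined as the union of the pair {a, b}."""
    return big_union(pair(a, b))

def intersection(a, b):
    """{x in a | x in b}, by Separation applied to a."""
    return frozenset(x for x in a if x in b)

def difference(a, b):
    """{x in a | x not in b}, by Separation applied to a."""
    return frozenset(x for x in a if x not in b)

EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
a = frozenset({EMPTY, ONE})
b = frozenset({ONE, TWO})
```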


Repeatedly using Pairing shows that for each a, b the set {{a}, {a, b}} exists and is 
unique, which gives the definition of ordered pair due to Kuratowski: 


Definition 1218 (Ordered Pair). (u,v) := {{u}, {u, v}}. 


Problem 1219. Prove that this is an acceptable definition of ordered pair, i.e., show
in ZF that $(a, b) = (c, d) \rightarrow (a = c \wedge b = d)$.


As in the case of the singleton and the unordered pair, the defined notion of ordered 
pair can be eliminated in the sense that “x = (u, v)” can be replaced by a pure ZF 
formula. 
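The defining property of the Kuratowski pair can be checked mechanically on a small universe. The sketch below is only an illustration under the assumption that sets are modelled by Python `frozenset` values; `kpair`, `universe`, and `pairs_are_faithful` are names invented for this example, and a finite check is of course not a proof in ZF.

```python
from itertools import product

def kpair(u, v):
    """Kuratowski ordered pair (u, v) := {{u}, {u, v}}."""
    return frozenset({frozenset({u}), frozenset({u, v})})

EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
universe = [EMPTY, ONE, TWO]

def pairs_are_faithful(universe):
    """Check (a, b) = (c, d) exactly when a = c and b = d, over the given universe."""
    for a, b, c, d in product(universe, repeat=4):
        if (kpair(a, b) == kpair(c, d)) != (a == c and b == d):
            return False
    return True
```

Note that `kpair(x, x)` collapses to the one-element set `{{x}}`, which is why the argument in Problem 1219 needs a case distinction.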

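Note that $(a, b) \in \mathcal{P}(\mathcal{P}(\{a, b\}))$, so we can use Separation to define the Cartesian
product of two sets.


Definition 1220. $a \times b := \{x \in \mathcal{P}(\mathcal{P}(a \cup b)) \mid \exists u, v \, (u \in a \wedge v \in b \wedge x = (u, v))\}$.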
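Definition 1220 can be followed literally for very small sets. This Python sketch assumes the `frozenset` model of sets and uses hypothetical helper names (`power_set`, `kpair`, `cartesian`); it separates the ordered pairs out of the ambient set P(P(a ∪ b)), which is feasible only when a ∪ b has a handful of elements.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of the frozenset s."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def kpair(u, v):
    """Kuratowski ordered pair {{u}, {u, v}}."""
    return frozenset({frozenset({u}), frozenset({u, v})})

def cartesian(a, b):
    """a x b := {x in P(P(a ∪ b)) | x = (u, v) for some u in a, v in b}."""
    ambient = power_set(power_set(a | b))
    return frozenset(
        x for x in ambient
        if any(x == kpair(u, v) for u in a for v in b)
    )

EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
a = frozenset({EMPTY, ONE})
b = frozenset({TWO})
product_ab = cartesian(a, b)
```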


Now we can define relations, functions, domains, ranges, equinumerosity between 
sets, equivalence relations and partitions, order relations, order-isomorphisms, well- 
orders, etc., using the ZF axioms we have so far. 


Definition 1221. The following notions can be formally defined in ZF: 


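1. R is a relation $\Leftrightarrow \exists a \exists b (R \subseteq a \times b)$.
2. $R^{-1} := \{(y, x) \in \mathcal{P}(\mathcal{P}(\bigcup\bigcup R)) \mid (x, y) \in R\}$.
3. $R \circ R := \{(x, z) \in \mathcal{P}(\mathcal{P}(\bigcup\bigcup R)) \mid \exists y ((x, y) \in R \wedge (y, z) \in R)\}$.
4. $\mathrm{dom}(R) := \{x \in \bigcup\bigcup R \mid \exists y ((x, y) \in R)\}$.
5. $\mathrm{ran}(R) := \mathrm{dom}(R^{-1})$.
6. R is symmetric $\Leftrightarrow R^{-1} \subseteq R$.
7. R is asymmetric $\Leftrightarrow R^{-1} \cap R = \varnothing$.
8. R is transitive $\Leftrightarrow R \circ R \subseteq R$.
9. $\Delta_A := \{z \in A \times A \mid \exists x \in A \, (z = (x, x))\}$.
10. R is a relation on A $\Leftrightarrow R \subseteq A \times A$.
11. R is reflexive on A $\Leftrightarrow$ R is a relation on A $\wedge$ $\Delta_A \subseteq R$.
12. R is irreflexive on A $\Leftrightarrow$ R is a relation on A $\wedge$ $\Delta_A \cap R = \varnothing$.
13. R is connected on A $\Leftrightarrow A \times A \subseteq R \cup R^{-1} \cup \Delta_A$.
14. E is an equivalence relation on A $\Leftrightarrow$ E is symmetric, transitive, and reflexive
on A.
15. R orders A $\Leftrightarrow$ R is asymmetric, transitive, and connected on A.
16. R well-orders A $\Leftrightarrow$ R orders A and $\forall B \subseteq A \, (B \neq \varnothing \rightarrow \exists b \in B \, (\neg \exists c \in B \, ((c, b) \in R)))$.
17. f is a function $\Leftrightarrow$ f is a relation $\wedge$ $\forall x, y, z \, ((x, y) \in f \wedge (x, z) \in f \rightarrow y = z)$.
18. f is one-one $\Leftrightarrow$ both f and $f^{-1}$ are functions.
19. $A \sim B \Leftrightarrow \exists f \, (f \text{ is one-one} \wedge \mathrm{dom}(f) = A \wedge \mathrm{ran}(f) = B)$.
20. $(A, R) \cong (B, S) \Leftrightarrow R \subseteq A \times A \wedge S \subseteq B \times B \wedge \exists f \, (f \text{ is one-one} \wedge \mathrm{dom}(f) = A \wedge \mathrm{ran}(f) = B \wedge \forall (u, v), (x, y) \in f \, ((u, x) \in R \leftrightarrow (v, y) \in S))$.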
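Several items of Definition 1221 translate almost word for word into operations on finite sets of pairs. The Python sketch below is an illustration only: it assumes relations are `frozenset` values of Python tuples (standing in for Kuratowski pairs), all function names are made up for this example, and `compose` is written for two relations although the definition above only uses R ∘ R.

```python
def inverse(R):
    """R^{-1} = {(y, x) | (x, y) in R}."""
    return frozenset((y, x) for (x, y) in R)

def compose(R, S):
    """{(x, z) | there is y with (x, y) in R and (y, z) in S}."""
    return frozenset((x, z) for (x, y) in R for (w, z) in S if y == w)

def dom(R):
    return frozenset(x for (x, _) in R)

def ran(R):
    return dom(inverse(R))

def is_symmetric(R):
    return inverse(R) <= R

def is_asymmetric(R):
    return not (inverse(R) & R)

def is_transitive(R):
    return compose(R, R) <= R

def is_function(f):
    return all(y == z for (x, y) in f for (w, z) in f if x == w)

def is_one_one(f):
    return is_function(f) and is_function(inverse(f))

# The usual strict order on {0, 1, 2, 3} and the successor map on {0, 1, 2}.
LESS = frozenset((m, n) for m in range(4) for n in range(4) if m < n)
SUCC = frozenset((n, n + 1) for n in range(3))
```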


Problem 1222. Using the axioms of ZF introduced so far, prove in ZF the principle 
of transfinite induction for well-orders formalized as follows: If R well-orders A, 
$B \subseteq A$, and


\[ \forall x \in A \, \big( (\forall y \in A \, ((y, x) \in R \rightarrow y \in B)) \rightarrow x \in B \big), \]

then B = A.


Problem 1223. Formulate and prove in ZF the comparability theorem for well- 
orders that was proved informally in Theorem 636: For any two well-orders one 
must be isomorphic to an initial segment of the other. Use only the axioms of ZF
that have already been introduced.


More basic results as above about sets and orders that were proved informally in 
the initial parts of this book can be replicated formally in ZF using the axioms we 
have so far. We leave it to the reader to pursue this project: Find and prove (in ZF) 
as many basic results as possible from the axioms of ZF we have introduced so far, 
while defining and developing (formally in ZF) appropriate notions necessary for 
those results. 


Define a well-order X to be an infinite well-order if either X or some initial 
segment of X is a nonempty order without a greatest element. Recall that a set is 
Dedekind infinite (or reflexive) if there is a one-to-one mapping of the set into one 
of its proper subsets. Here are the formal versions of these notions in ZF.


Definition 1224 (ZF). (A, R) is an infinite well-order if and only if


\[ R \text{ well-orders } A \;\wedge\; A \neq \varnothing \;\wedge\; \big( [\forall x \in A \, \exists y \in A \, ((x, y) \in R)] \;\vee\; [\exists z \in A \, (\forall x \in A \, ((x, z) \in R \rightarrow \exists y \in A \, ((x, y) \in R \wedge (y, z) \in R)))] \big). \]


A is Dedekind infinite (or reflexive) if and only if

\[ \exists f \, (f \text{ is a bijection} \wedge \mathrm{dom}(f) = A \wedge \mathrm{ran}(f) \subseteq A \wedge A \setminus \mathrm{ran}(f) \neq \varnothing). \]


Problem 1225. From the ZF axioms introduced so far, show that there is an infinite 
well-order if and only if there is a Dedekind infinite (reflexive) set. 


The development of ZF so far should already illustrate the ZF-paradigm “every- 
thing is a set”: In ZF, the only type of objects that exist are sets. Ordered pairs, 
functions, and well-orders have all been shown to be sets in ZF, and all theorems of 
ZF are, in the end, results about sets and membership. 

However, not everything that was done informally in Parts I-III can be formally
developed in ZF yet. Numbers, in particular cardinals and ordinals, cannot be 
formed yet in full generality, and existence of infinite sets cannot be established. For 
these we will need the axioms of replacement and infinity which will be introduced 
in the next sections. But once those axioms are available, the formal development of 
ZF becomes capable of realizing the ZF-paradigm “everything is a set” to its fullest
extent: Not only all types of numbers, such as natural, rational, real, and complex 
numbers, but also all objects of higher mathematics such as algebraic structures 
and spaces, can be constructed as sets. In fact, all mathematical statements can be 
expressed in the language of ZF set theory, and all theorems of classical mathematics 
can be proved in ZFC (ZFC = ZF augmented with the Axiom of Choice). 


Classes, Relationals, and Functionals 


We have seen that in ZF there is no set containing all sets. However, it is convenient 
to informally refer to such large collections as classes. For example, we will let V 
denote the collection of all sets, 


V := {x| x = x}, 


so that V is a class which is not a set. Another example is the class of all ordered 
pairs, conveniently denoted by V x V, which cannot be a set in ZF either (why?). 
A class such as V does not formally exist in ZF, and the term “class” really refers 
to a ZF property (i.e., a ZF formula with a specified free variable and possibly with 
additional parameters). For example, with x as the defining free variable, the class V 
above stands for the ZF property V = V(x) where V(x) is the ZF formula ‘x = x’,
and the class $V \times V$ can be replaced by the ZF formula ‘$\exists y \exists z (x = (y, z))$’. Of
course some ZF-formulas do define sets, such as the formula ‘$x \neq x$’ which defines
the empty set $\varnothing$ and the formula ‘$\forall z (z \notin x)$’ which defines the set $\{\varnothing\}$. So some
classes are sets. In fact, every set is a class since for any set a, the formula ‘$x \in a$’
defines a. A class which is not a set, such as V or $V \times V$, is called a proper class.

As another example, the membership relation ‘$\in$’, which is a subclass of $V \times V$,
is a relation which is not a set:


\[ \in \; = \{(x, y) \mid x \in y\} \subseteq V \times V. \]


A relation which may or may not be a set, i.e., a class consisting of ordered pairs
alone, will be called a relational. Informally, a relational is simply any subclass
of $V \times V$. Formally, a relational refers to a ZF formula of the form $\varphi(x, y)$
with two specified free variables. Another familiar relational which is not a set is
equinumerosity, $\sim$, given by


\[ \sim \; = \{(x, y) \mid |x| = |y|\}, \]


and it corresponds to the ZF formula “$\exists f$ (f is a bijection from x onto y).” The
domain and range of a relational are classes which may not in general be sets. For
example, we have $\mathrm{dom}(\in) = V$ and $\mathrm{ran}(\in) = V \setminus \{\varnothing\}$.

A relational F will be called a functional if $xFy \wedge xFy' \rightarrow y = y'$. For a
functional F, we will use the functional notation F(x) to denote the unique y such
that xFy. Formally, a functional expression ‘y = F(x)’ is really a ZF formula of
the form $\varphi(x, y)$ for which it can be proved in ZF that


\[ \forall x \forall y \forall y' \, (\varphi(x, y) \wedge \varphi(x, y') \rightarrow y = y') \qquad (\text{“}\varphi\text{ is many-one”}). \]


In fact, we have already been using functionals and functional notation in expressions
like {x}, $\mathcal{P}(x)$, and $\bigcup x$, which assign a unique set to every set x. These
functionals all have domain V, but some functionals are not defined for all sets. For
example, the functional $\pi_2$ which assigns to every ordered pair its second coordinate
($\pi_2((a, b)) = b$ for all a, b) has domain $\mathrm{dom}(\pi_2) = V \times V$.


21.3 The Replacement Axiom


The Axiom of Replacement (due to Fraenkel) says that if A is a set and F is any 
functional whose domain includes A, then the image F[A] is also a set. In a slightly
variant but equivalent form it says that for any set A and functional F we can form
the set


\[ \{F(x) \mid x \in A \cap \mathrm{dom}(F)\}. \]


Formally, it is an axiom scheme (like separation):


ZF 7 (Axiom Scheme of Replacement). For every ZF formula $\varphi = \varphi(x, y)$, we
have the axiom:


\[ \forall x \forall y \forall z \, (\varphi(x, y) \wedge \varphi(x, z) \rightarrow y = z) \rightarrow \forall a \exists b \forall u \forall v \, (u \in a \wedge \varphi(u, v) \rightarrow v \in b). \]


We had already implicitly used the Axiom of Replacement in some constructions 
such as the Hartogs’ ordinal. Assume for the moment that we have defined ordinals 
and that every well-order X has a unique ordinal Ord(X). (This will be done in 
detail in the next section.) For any set A, we want to form the “Hartogs’ set” H(A) 
of ordinals as 


\[ H(A) := \{\alpha \mid \text{there is an injection from } W(\alpha) \text{ into } A\}, \]


but in the above form it is not clear that this collection of ordinals will form a set.
To address this issue, consider the set $W_A$ of all well-orders defined on subsets of
A. Since $W_A \subseteq \mathcal{P}(\mathcal{P}(\mathcal{P}(A)))$, $W_A$ can be shown to be a set by the power set
and separation axioms. Since $W_A$ is a set, we can use Replacement to derive that


\[ H(A) = \{\mathrm{Ord}(R) \mid R \in W_A\} \]


is a set too.
However, in most of ordinary mathematics, replacement is scarcely used. 


21.4 The von Neumann Ordinals 


In 1923, von Neumann [80] gave a remarkably simple, natural, absolute definition 
for ordinal numbers: 


The aim of this work is to present Cantor’s notion of ordinal numbers in a clear and concrete 
form. ... 

We actually take the proposition “Every ordinal is the type of the set of all ordinals 
preceding it” as the basis of our considerations. But we avoid the vague notion of 
“type” by expressing it as follows: “Every ordinal is the set of all ordinals preceding it.” 
... Accordingly (O is the empty set, (a, b, c, ...) is the set with the elements a, b, c, ...):


0 = O,
1 = (O),
2 = (O, (O)),
3 = (O, (O), (O, (O))),
ω = (O, (O), (O, (O)), (O, (O), (O, (O))), ...)


[Author’s translation of a part of [80, p. 199], done with permission from Acta Sci. Math.
(Szeged).]


We now present von Neumann’s results.


Definition 1226 (Von Neumann Well-Orders). A well-order X is said to be a
von Neumann well-order if for every $x \in X$, we have $x = \{y \in X \mid y < x\}$ (that is,
x is equal to the set Pred(x) consisting of its predecessors).


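Clearly the examples listed by von Neumann above, namely

\[ \varnothing, \quad \{\varnothing\}, \quad \{\varnothing, \{\varnothing\}\}, \quad \{\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing\}\}\}, \quad \dots \]


are all von Neumann well-orders if ordered by the membership relation “$\in$,” and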
the process can be iterated through the transfinite. Our immediate goal is to show 
that these and only these are the von Neumann well-orders, with exactly one 
von Neumann well-order for each ordinal (order type of a well-order). This is called 
the existence and uniqueness result for the von Neumann well-orders. 


An immediate consequence of the definition of a von Neumann well-order is: 


Proposition 1227. Let X be a well-order. If X is a von Neumann well-order then
the ordering relation on X coincides with the membership relation “$\in$” restricted
to X, that is, for all $x, y \in X$ we have $x < y \leftrightarrow x \in y$.


Problem 1228. Show that the converse of the above proposition fails. 


It is also immediate that every proper initial segment of a von Neumann well-order
X is a member of X and vice versa:


Proposition 1229. For a von Neumann well-order X, the proper initial segments
of X coincide with the members of X; that is, Y is a proper initial segment of X if
and only if $Y \in X$. Thus, the set of proper initial segments of X equals X itself.


Since a set cannot be equal to any of its proper subsets, we get:


Corollary 1230. If X is a von Neumann well-order then $X \notin X$.


Corollary 1231. Every member of a von Neumann well-order is a von Neumann 
well-order (under the membership relation). 


Corollary 1232. If X is a von Neumann well-order, then the ordering relation
on X coincides with the relation of proper set containment for the members of
X, that is, for all $x, y \in X$ we have $x < y \leftrightarrow x \subsetneq y$.


Problem 1233. Show that the converse of the above result fails. 


We now proceed to prove the existence and uniqueness results for von Neumann 
well-orders.


Uniqueness


First, we have the following uniqueness theorem, which says that isomorphic 
von Neumann well-orders are identical: 


Theorem 1234. If X and Y are isomorphic von Neumann well-orders, then X=Y. 


Proof. Let f be an order isomorphism from X onto Y. We show by transfinite 
induction that f(x) = x for all x, thereby establishing the result. Suppose that 
x € X and that f(u) = u for all u < x. Then since f is an order isomorphism, we 
have 


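\begin{align*}
f(x) &= \{v \mid v \in Y \wedge v < f(x)\} = \{f(u) \mid u \in X \wedge f(u) < f(x)\} \\
     &= \{f(u) \mid u \in X \wedge u < x\} \\
     &= \{u \mid u \in X \wedge u < x\} = x. \qquad \square
\end{align*}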


In other words, the uniqueness theorem says that every well-order is isomorphic to 
at most one von Neumann well-order. 
Uniqueness has some immediate consequences. 


Corollary 1235. If a von Neumann well-order X is isomorphic to a proper initial
segment of a von Neumann well-order Y, then $X \in Y$.


Hence, using the comparability theorem for well-orders, we get:


Corollary 1236 (Comparability). For any two von Neumann well-orders one
must be an initial segment of the other. For any two distinct von Neumann
well-orders one must be a member of the other. Thus if X and Y are von Neumann
well-orders, then exactly one of $X \in Y$, $X = Y$, and $Y \in X$ holds.


Corollary 1237. If X and Y are von Neumann well-orders, then the order type of
X is less than the order type of Y if and only if $X \in Y$.


Corollary 1238. Every von Neumann well-order X consists of all von Neumann 
well-orders having order type less than that of X. 


Corollary 1239. If X is a set of von Neumann well-orders and if for every member 
u of X any von Neumann well-order having order type less than that of u is also in 
X, then X is itself a von Neumann well-order. 


Definition 1240 (Successor of a Set). For any set X, we define the successor of
X, denoted by $X^+$, as $X^+ := X \cup \{X\}$.


Proposition 1241. If X is a von Neumann well-order of order type $\alpha$, then $X^+$ is
a von Neumann well-order of order type $\alpha + 1$.


Proof. Since $X \notin X$, with $\in$ as the ordering relation $X \cup \{X\}$ becomes a strict
linear order with greatest element X in which the set of $\in$-predecessors of X is X
itself. □


Corollary 1242. The union of any set C of von Neumann well-orders is a von Neumann
well-order whose order type is the supremum of the order types of members
of C.


Proof. C is a chain by comparability. Hence $\bigcup C$ is linearly ordered by $\in$. Regarding
$\bigcup C$ as a linear order with $\in$ as the ordering relation, we see that each member
of C is an initial segment of $\bigcup C$, and also each proper initial segment of $\bigcup C$ is
contained in a member of C. Hence $\bigcup C$ is well-ordered by $\in$, and its order type is
the supremum of the order types of members of C. □


Existence 


We now prove the existence theorem, whose proof uses Replacement in an essential
way.¹


Theorem 1243. For each well-order X there is a (unique) von Neumann well-order
which is isomorphic to X.


Proof. The proof is by transfinite induction on well-orders. Suppose that X is
a well-order such that every proper initial segment of X is isomorphic to some
von Neumann well-order. We show that then X itself is isomorphic to some
von Neumann well-order.

For every $x \in X$ there is a unique von Neumann well-order $Y_x$ isomorphic to
Pred(x) (by induction hypothesis and by uniqueness). By Replacement, we can
form the set


\[ C := \{Y_x \mid x \in X\}. \]


Note that for any $x, y \in X$ we have x < y if and only if the order type of $Y_x$ is less
than the order type of $Y_y$, which holds if and only if $Y_x \in Y_y$. Hence the mapping
$x \mapsto Y_x$ is an order-isomorphism from X onto $(C, \in)$. Therefore C is well-ordered
by $\in$ with every member of C equal to the set of its predecessors, and hence C must
be a von Neumann well-order isomorphic to X. □
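For a finite well-order the construction in this proof can be carried out explicitly. The following Python sketch is an illustration under stated assumptions: the well-order is given as a list of distinct hashable elements in increasing order, sets are modelled by `frozenset`, and the names `von_neumann_image`, `ordinal`, `X`, `Y`, `C` are chosen for this example only. Each element x is sent to the set of images of its predecessors, and C collects these images.

```python
def von_neumann_image(elements):
    """Map each x of a finite well-order (listed in increasing order)
    to Y_x = {Y_u | u < x}."""
    Y = {}
    preds = []
    for x in elements:
        Y[x] = frozenset(preds)
        preds.append(Y[x])
    return Y

def ordinal(n):
    """The finite von Neumann ordinal n, built by n successor steps."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})
    return x

X = ["a", "b", "c", "d"]       # a four-element well-order a < b < c < d
Y = von_neumann_image(X)
C = frozenset(Y.values())      # C = {Y_x | x in X}
```

In this finite case C comes out as the von Neumann ordinal 4, and x < y in X corresponds to membership between the images.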


¹Earlier than von Neumann, others (Zermelo, Mirimanoff, etc.) had partially developed similar
ideas, but the results were limited as the general existence theorem could not be proved due to the
lack of the Replacement axiom.


New Definition of Ordinal Number


By the existence and uniqueness theorems, for every well-order X there is a unique
von Neumann well-order isomorphic to it, and two well-orders are isomorphic if
and only if their associated von Neumann well-orders are identical. We thus have a
complete invariant for the equivalence relation of order-isomorphism between
well-orders, which provides a remarkably simple, elegant, and concrete definition for
ordinal numbers:


Definition 1244 (Von Neumann Ordinals). For each well-order X, we define the 
von Neumann ordinal of X, or simply the ordinal of X, denoted by Ord(X), to be 
the unique von Neumann well-order isomorphic to X. 

$\alpha$ is called an ordinal if $\alpha$ is the ordinal of some well-order, i.e. if $\alpha$ is a
von Neumann well-order. The class of all ordinals will be denoted by On.


Von Neumann’s definition of ordinal numbers has become the standard in axiomatic 
set theory, and forms the backbone of the universe of sets and the theory of the 
transfinite. From now on we will follow the above definition and use the term 
“ordinal” to mean a von Neumann well-order. 

Since any ordinal is equal to the set of smaller ordinals, the older set $W(\alpha) =
\{\beta \mid \beta < \alpha\}$ becomes identical to $\alpha$ itself:


\[ W(\alpha) = \{\beta \mid \beta < \alpha\} = \{\beta \mid \beta \in \alpha\} = \alpha. \]


We can therefore dispense with the notation “$W(\alpha)$,” replacing it with the simpler $\alpha$.

By the Burali-Forti paradox, the class On is not a set and does not exist formally 
in ZF, and the expression “x € On” really stands for the ZF formula “x is an 
ordinal.” However, if a subclass A of On is a proper initial segment of On then 
A is a set, since A then equals the least ordinal not in A. 


Definition 1245 (Zero, Successor, and Limit Ordinals). We define the smallest
ordinal zero as $0 := \varnothing$. An ordinal $\alpha$ is a successor ordinal if $\alpha = \beta^+$ for some
ordinal $\beta$. An ordinal which is neither zero nor a successor is a limit ordinal.


Definition 1246. We say that x is an initial set of ordinals if every member of x is
an ordinal and for all $y \in x$, if z < y then $z \in x$.


Problem 1247. x is an ordinal if and only if x is an initial set of ordinals. 


Theorem 1248 (The Principle of Transfinite Recursion). For each functional G
with domain V, there exists a unique functional F with domain On such that for
every $\alpha \in \mathrm{On}$,


\[ F(\alpha) = G(\langle F(\beta) \mid \beta < \alpha \rangle) \quad (\text{i.e., } F(\alpha) = G(F \restriction \alpha)). \]


Problem 1249. Prove Theorem 1248 in ZF. 


[Hint: The proof is the same as that of Theorem 622, but note the following. 
Formally, one cannot quantify over classes in ZF, so phrases like “for each functional 
G” and “there exists a functional F” are not officially allowed. So Theorem 1248 is 
really a theorem scheme: For each class G (really a formula), another class (formula)
F can be formed (in terms of G) for which the corresponding instance of the
theorem can be proved in ZF.

However, there is little danger in informally treating relationals and functionals
as ordinary sets as long as one is careful about them. (Functionals defined on proper
initial segments of On are really sets, so they can be quantified over at will.) Thus it
is all right to pattern the proof as follows:


Given $G\colon V \to V$, define F to be the class of those ordered pairs (x, y) such that there
is an ordinal $\alpha$ and a function h with $\mathrm{dom}(h) = \alpha$, $x \in \mathrm{dom}(h)$, h(x) = y, and which
satisfies $h(\beta) = G(h \restriction \beta)$ for all $\beta < \alpha$. ...


The rest of the proof, which consists of showing that F is a functional satisfying the 
theorem, is identical to that of Theorem 622.] 
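Restricted to finite ordinals, the recursion scheme F(α) = G(F ↾ α) can be sketched in Python. This is only an illustration under assumptions not made in the text: ordinals are modelled as nested `frozenset` values, the restriction F ↾ α is passed to G as a Python dict, and the names `recursion`, `ordinal`, and `G_range` are hypothetical.

```python
def ordinal(n):
    """The finite von Neumann ordinal n."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})
    return x

def recursion(G):
    """Return F on finite ordinals with F(alpha) = G(F restricted to alpha)."""
    cache = {}

    def F(alpha):
        if alpha not in cache:
            # The members of alpha are exactly the smaller ordinals.
            restriction = {beta: F(beta) for beta in alpha}
            cache[alpha] = G(restriction)
        return cache[alpha]

    return F

def G_range(h):
    """G sends a function h to its range, so F(alpha) = {F(beta) | beta < alpha}."""
    return frozenset(h.values())

F = recursion(G_range)    # this F is the identity on ordinals
count = recursion(len)    # here F(n) is the number of predecessors of n
```

With `G_range` the recursion reproduces each ordinal from its predecessors, which mirrors the statement that every ordinal is the set of all smaller ordinals.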


Transitive Sets 


Definition 1250. A set x is called transitive if for all y, z, $z \in y \wedge y \in x \rightarrow z \in x$.


Problem 1251. Show that a set x is transitive if and only if every element of x is a
subset of x if and only if $\bigcup x \subseteq x$.


Problem 1252. Show that the empty set is transitive and that for any set x, if x is
transitive then so are $x^+ = x \cup \{x\}$, $\mathcal{P}(x)$, and $\bigcup x$.


Problem 1253. Show that every ordinal is transitive. 


The following problem characterizes von Neumann ordinals as transitive sets
well-ordered by the membership relation $\in$, and thus in some treatments it is taken as the
definition for von Neumann ordinals.


Problem 1254 (Characterization of Von Neumann Ordinals). Show that x is
an ordinal if and only if x is transitive and x is well-ordered by the membership
relation $\in$.
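For hereditarily finite sets this characterization gives a direct test. The Python sketch below assumes the `frozenset` model of sets and uses illustrative names; because the sets are finite, being well-ordered by membership reduces to membership being a strict linear order, which is what `is_linearly_ordered_by_membership` checks.

```python
def is_transitive(x):
    """Every member of a member of x is a member of x."""
    return all(z in x for y in x for z in y)

def is_linearly_ordered_by_membership(x):
    """Membership is irreflexive, transitive, and connected on x."""
    m = list(x)
    irreflexive = all(a not in a for a in m)
    transitive = all(a in c for a in m for b in m for c in m
                     if a in b and b in c)
    connected = all(a == b or a in b or b in a for a in m for b in m)
    return irreflexive and transitive and connected

def is_ordinal(x):
    """Finite case of the characterization: transitive and ordered by membership."""
    return is_transitive(x) and is_linearly_ordered_by_membership(x)

EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
THREE = frozenset({EMPTY, ONE, TWO})
V3 = frozenset({EMPTY, ONE, frozenset({ONE}), TWO})   # P(P(P(∅)))
```

The set `V3` shows that transitivity alone is not enough: it is transitive, but its members {{∅}} and {∅, {∅}} are not comparable under membership.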


21.5 Finite Ordinals and the Axiom of Infinity 


Finite Ordinals 


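The smallest von Neumann well-order is $0 = \varnothing$, and repeatedly applying the
successor operation we get:


\begin{align*}
0 &:= \varnothing, \\
1 &:= 0 \cup \{0\} = \{0\} = \{\varnothing\}, \\
2 &:= 1 \cup \{1\} = \{0, 1\} = \{\varnothing, \{\varnothing\}\}, \\
3 &:= 2 \cup \{2\} = \{0, 1, 2\} = \{\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing\}\}\}, \text{ etc.}
\end{align*}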


The above are the finite von Neumann well-orders. Formally, we have 


Definition 1255 (Finite and Infinite Ordinals). n is a finite ordinal if either n = 0
or n is a successor ordinal and every nonzero element of n is a successor ordinal.
An infinite ordinal is an ordinal which is not finite. 


Problem 1256. 0 is a finite ordinal. If x is a finite ordinal then so is $x^+$.


Problem 1257. If n is a finite ordinal and $m \in n$ then m is a finite ordinal. Thus
any initial segment of a finite ordinal is a finite ordinal.


Problem 1258. If n is a nonzero finite ordinal then $n = m^+$ for some finite ordinal
$m \in n$.


The existence of infinite ordinals cannot be proved yet, since the collection of all 
finite ordinals is not known to be a set—at this point it is only a class. But we still 
have: 


Problem 1259. Any infinite ordinal contains all finite ordinals as members. 


The class of finite ordinals together with the successor operation can now be verified 
to satisfy the Dedekind—Peano axioms, including the principle of finite induction. 


Theorem 1260 (The Principle of Finite Induction). Let P be a ZF property
for which we have P(0) and $\forall n (P(n) \rightarrow P(n^+))$. Then we have
$\forall n (n \text{ is a finite ordinal} \rightarrow P(n))$.


Proof. Assume the hypothesis, and to derive a contradiction, suppose that n is a
finite ordinal for which we have $\neg P(n)$. Then $n \neq 0$ by hypothesis, and so $n = m^+$
for some finite ordinal $m \in n$. Now we must have $\neg P(m)$ since otherwise by
hypothesis we would have P(n). Hence the set $\{k \in n \mid \neg P(k)\}$ is nonempty and
must have a least element q. Since P(0) is true, we have $q \neq 0$, and so, since q is
a nonzero finite ordinal, $q = r^+$ for some finite ordinal $r \in q$. We now must have
P(r) by minimality of q, which implies P(q) is true, a contradiction. □


Therefore, we can define operations on the finite ordinals such as addition and 
multiplication, and the entire theory of “Peano Arithmetic” can be developed based 
on the finite ordinals. 

We kept our development of finitude in ZF independent of the one given in 
Sect. 5.3, but they are really equivalent in the following sense. 


Problem 1261. Call a set x inductive if for any $y \in \mathcal{P}(\mathcal{P}(x))$ the two conditions
$\varnothing \in y$ and $\forall z \in y \, \forall u \in x \, (z \cup \{u\} \in y)$ imply $x \in y$. Show that a set x is inductive
if and only if $x \sim n$ for some (unique) finite ordinal n.


We now want to get $\omega$, the supremum of all the finite ordinals, which, as an ordinal,
must be equal to the set of all smaller ordinals. In other words, $\omega$ must consist of all
finite ordinals, as in:


\[ \omega := \{0, 1, 2, \dots\}. \]


But how can we justify that this exists as a set and is not a proper class? More 
precisely, how can it be proved that there is a set consisting precisely of the finite 
ordinals? 

To discuss this question, let “$x = \omega$” stand for “x is the set of all finite ordinals,”
or more precisely for the ZF formula ‘$\forall y (y \in x \leftrightarrow y \text{ is a finite ordinal})$’, and let
“$\omega$ exists” stand for the ZF formula ‘$\exists x (x = \omega)$’.

Note that to show that $\omega$ exists, it suffices to show that there is some set b
containing all finite ordinals, since then by Separation we can get
$\omega = \{y \in b \mid y \text{ is a finite ordinal}\}$. By the principle of finite induction, any set b satisfying
$0 \in b \wedge \forall n (n \in b \rightarrow n^+ \in b)$ will contain all finite ordinals. Moreover, since
every infinite ordinal contains all finite ordinals, the existence of $\omega$ can be seen to
be equivalent to the existence of an infinite ordinal. These equivalences do not need
the replacement axiom, and so we have:


Proposition 1262. From the ZF axioms introduced so far one can derive, without 
using the replacement axiom, that the following conditions are equivalent to each 
other: 


1. $\omega$ exists.
2. There is a set b such that $0 \in b \wedge \forall n (n \in b \rightarrow n^+ \in b)$.
3. There is an infinite ordinal.


It turns out that none of the statements above (in particular the existence of $\omega$)
can be derived from the axioms introduced so far. Zermelo, in his 1908 paper [85],
introduced the Axiom of Infinity precisely for this purpose.²


ZF 8 (Axiom of Infinity). $\omega$ exists, or equivalently, there is an infinite ordinal, or
equivalently $\exists b (0 \in b \wedge \forall n (n \in b \rightarrow n^+ \in b))$.


We will now show, using Replacement, that the Axiom of Infinity is equivalent to 
the existence of an infinite set, where by an infinite set we mean a Dedekind infinite 
(reflexive) set. 

First, since the mapping $n \mapsto n^+$ maps $\omega$ one-to-one into a proper subset of $\omega$, the
Axiom of Infinity implies (even without Replacement) that there is a Dedekind
infinite (reflexive) set.


²Zermelo used the operation $x \mapsto \{x\}$ instead of the successor operation, and his version of the
axiom stated that there is a set b such that $\varnothing \in b$, and $\{x\} \in b$ for every $x \in b$.


To get the converse, we will use Replacement. If we carefully examine the proof
of the existence theorem, we see that the existence of an ordinal $\alpha$ was obtained
from the existence of some well-order of type $\alpha$, through the use of the replacement
axiom. Thus the existence of $\omega$ will follow from the existence of a well-order of
type $\omega$, and the existence of an infinite ordinal will follow from the existence of
an infinite well-order. Now, earlier we had seen that the existence of a Dedekind
infinite (reflexive) set is equivalent to the existence of an infinite well-order, hence
under replacement all our conditions become equivalent.


Theorem 1263. From the ZF axioms introduced prior to Infinity (thus including 
Replacement), one can derive that the following are equivalent: 


1. The Axiom of Infinity, that is, $\omega$ exists, or equivalently that there is an infinite
ordinal.

2. There is an infinite well-order. 

3. There is a Dedekind infinite (reflexive) set. 


Problem 1264. Prove, based on the ZF axioms introduced so far, that the ordinal
$\omega + \omega$ exists.


The Replacement axiom, together with the Axiom of Infinity, guarantees access to 
the transfinite. All the infinite cardinals and ordinals that we had studied earlier can 
be shown to exist using Replacement. 


21.6 Cardinal Numbers and the Transfinite 


In addition to providing a canonical representative for every well-order, the 
von Neumann ordinals can give a complete invariant for the equivalence relation of 
equinumerosity so long as the Axiom of Choice is assumed. 


Definition 1265 (Equinumerosity, Initial Ordinals). We write $a \sim b$ to denote a
is equinumerous (bijective) with b. An ordinal $\alpha$ is said to be an initial ordinal if
there is no $\beta < \alpha$ such that $\beta \sim \alpha$.


All finite ordinals are initial ordinals. The first few transfinite initial ordinals are
$\omega, \omega_1, \omega_2, \dots$.

By AC, every set can be well-ordered, and so must be equinumerous to some 
ordinal, and therefore also to some initial ordinal. Also, in Cantor’s original 
conception of the transfinite (which implicitly assumed AC so that every infinite 
cardinal was an aleph), cardinals correspond naturally and uniquely with the initial 
ordinals. We thus obtain the classic Cantor-Von Neumann definition of cardinal 
numbers. 


Definition 1266 (Cantor-Von Neumann Cardinals (AC)). For any set x, define
|x|, the cardinality of x, to be the least ordinal $\alpha$ such that $\alpha \sim x$.
We say that $\kappa$ is a cardinal if $\kappa = |x|$ for some set x.


Thus in the Cantor-Von Neumann definition, every cardinal is an ordinal.


Problem 1267. $\alpha$ is an initial ordinal if and only if $\alpha$ is a cardinal.


The Cantor-Von Neumann definition is readily seen to satisfy the condition

\[ |x| = |y| \leftrightarrow x \sim y, \]


and so we have an adequate definition of cardinal numbers under AC. In particular,
we have $\aleph_0 = \omega$, $\aleph_1 = \omega_1$, etc., but the ordinal |R| cannot be effectively determined
because of the independence of the continuum hypothesis even under AC.


The Cumulative Hierarchy 


We had earlier defined the sets $V_n$ for finite ordinals $n$, where 
$V_0 = \emptyset$ and $V_{n+1} = P(V_n)$. 
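The finite levels can be computed directly. The following is a minimal Python sketch, assuming sets are modelled as nested `frozenset` values; the helper names `power_set` and `finite_levels` are illustrative and not part of the text.

```python
from itertools import combinations


def power_set(s):
    """Return the set of all subsets of the frozenset s."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


def finite_levels(count):
    """Return [V_0, ..., V_{count-1}], with V_0 empty and V_{n+1} = P(V_n)."""
    levels = [frozenset()]
    for _ in range(count - 1):
        levels.append(power_set(levels[-1]))
    return levels


levels = finite_levels(5)            # V_0 through V_4
sizes = [len(v) for v in levels]     # each size is 2 to the power of the previous one
increasing = all(levels[i] <= levels[i + 1] for i in range(4))
```

Only the first few levels are practical to enumerate this way, since the size of each level is exponential in the size of the one before it.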


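Now, by Infinity, $\omega$ exists, and so by Replacement and Union, we can define 

$$V_\omega = \bigcup_{n<\omega} V_n.$$ 

Since each $V_n$ is finite for $n \in \omega$, $V_\omega$ is countably infinite. We can next define 

$$V_{\omega+1} = P(V_\omega), \qquad V_{\omega+2} = P(V_{\omega+1}),$$ 

so that $V_{\omega+1}$ has cardinality $\mathfrak{c} = 2^{\aleph_0}$, $V_{\omega+2}$ has cardinality $2^{\mathfrak{c}}$, and so on. 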


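Definition 1268 (The Cumulative Hierarchy). Define the cumulative hierarchy 
of sets $V_\alpha$, $\alpha \in \mathrm{On}$, by transfinite recursion on $\alpha$ as: 

$$V_0 = \emptyset, \qquad V_{\alpha+1} = P(V_\alpha), \qquad V_\alpha = \bigcup_{\beta<\alpha} V_\beta \ \text{ if $\alpha$ is a limit ordinal.}$$ 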


The sets $V_\alpha$ increase with the ordinal $\alpha$, that is, 

$$\alpha \le \beta \implies V_\alpha \subseteq V_\beta.$$ 




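In fact, we have: 


Problem 1269. For any ordinal $\alpha$, $V_\alpha \subsetneq V_{\alpha+1}$, and 

$$V_\alpha = \bigcup_{\beta<\alpha} V_{\beta+1}.$$ 


Problem 1270. $x \in V_\alpha$ if and only if $x \subseteq V_\beta$ for some $\beta < \alpha$. 

Problem 1271. If $\alpha$ is an ordinal, then $\alpha \in V_{\alpha+1} \setminus V_\alpha$. 


Problem 1272. Show that $V_\alpha$ is transitive for each ordinal $\alpha$. 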


Set Theory Without Replacement 


Definition 1273 (Zermelo Set Theory, Z). The axiom system consisting of all the 
ZF axioms mentioned so far, including Infinity, but without Replacement, is known 
as Zermelo Set Theory and is denoted by Z. 


The axiom system Z, introduced by Zermelo$^3$ in 1908 [85], was the first formulation 
of axiomatic set theory in the modern sense. As we will indicate below, Z has 
sufficient power for the development of almost all of ordinary mathematics. 


Problem 1274. Prove the following in Z. Given any sets $A$ and $B$, the set $A \times B$ 
(cartesian product) and the set $B^A$ of all functions from $A$ to $B$ both exist. Also, 
given any set $A$, the set $A^{<\omega}$ of all finite sequences from $A$ exists. (A finite sequence 
from $A$ is a function whose domain is a finite ordinal and whose range is contained 
in $A$.) 


[Hint: By Union, Power Set, and Separation, note that $A \times B \subseteq P(P(A \cup B))$, 
$B^A \subseteq P(A \times B)$, and $A^{<\omega} \subseteq P(\omega \times A)$.] 
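The first two inclusions in the hint can be checked on a small example. The sketch below assumes Kuratowski ordered pairs, $(a, b) = \{\{a\}, \{a, b\}\}$, and two small illustrative sets `A` and `B`; the names `kpair`, `cartesian`, `bound`, and `functions` are hypothetical helpers, not notation from the text.

```python
from itertools import combinations, product


def power_set(s):
    """Return the set of all subsets of the frozenset s."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


def kpair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


A = frozenset({0, 1})
B = frozenset({1, 2})

# A x B as a set of Kuratowski pairs, and the bounding set P(P(A u B)).
cartesian = frozenset(kpair(a, b) for a in A for b in B)
bound = power_set(power_set(A | B))

# Every function from A to B, represented as a set of pairs, is a subset of A x B.
functions = [
    frozenset(kpair(a, b) for a, b in zip(sorted(A), values))
    for values in product(sorted(B), repeat=len(A))
]
```

Since each pair lies in `bound` and each function is a subset of `cartesian`, Separation applied to the bounding sets yields $A \times B$ and $B^A$, which is the shape of the argument the hint suggests.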


Developing Mathematics in Z 


By the Axiom of Infinity, the von Neumann ordinal $\omega$ exists in Z. So we can iterate 
the power set operation and get the existence of each of the following sets in Z: 
$P(\omega), P(P(\omega)), \dots, P^n(\omega), \dots$, where we define $P^0(\omega) := \omega$ and $P^{n+1}(\omega) := 
P(P^n(\omega))$. Also the sets $\omega \times \omega$, $P(\omega \times \omega)$, $\omega^\omega$ all exist as well (with $\omega \times \omega \subseteq P^2(\omega)$, 
$\omega^\omega \subseteq P^3(\omega)$, etc.). This allows the construction of the positive rationals (ratios) as 
ordered pairs from $\omega$, so the set of rational numbers $\mathbf{Q}$ is in $P^n(\omega)$ for some finite $n$. 
Therefore the set $\mathbf{R}$ of real numbers (defined as Dedekind cuts of rational numbers 
so that $\mathbf{R} \subseteq P(\mathbf{Q})$) is also in $P^n(\omega)$ for some finite $n$. So $P(\mathbf{R} \times \mathbf{R})$, i.e., the set of 
all relations on $\mathbf{R}$, is a set in Z. Hence also the set of all real functions is a set whose 
existence can be established in Z. Complex numbers and functions can evidently be 
constructed too. In this way, all common mathematical objects can be seen to be 
present in $P^n(\omega)$ for some finite $n$, and almost all of ordinary mathematics can be 
carried out in Z. 


$^3$ As mentioned earlier, Zermelo’s original system used the mapping $x \mapsto \{x\}$ instead of the 
successor operation in its formulation of the Axiom of Infinity, and the infinite set whose existence 
was asserted was $\{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \dots\}$. In this respect our system Z is strictly distinct from 
Zermelo’s original; see Kunen [44, p.125], Exercise II.4.21. 


Limitations on the Ordinals and the Transfinite in Z 


Since the ordinal $\omega$ exists in Z, so do the ordinals $\omega + 1, \omega + 2, \dots, \omega + n, \dots$, 
by repeatedly applying the successor operation. But, due to lack of Replacement, 
we cannot prove in Z that the ordinal $\omega + \omega$ exists. In Z, the sum of two ordinals 
may fail to exist! On the other hand, the “sum of well-orders” exists in the following 
sense. 


Problem 1275. Prove in Z that if a and b are well-orders then there is a well-order 
c which can be partitioned into an initial segment u and a final segment v such that 
a is order isomorphic to u and b is isomorphic to v. 


Thus in Z one can prove that there exist well-orders of type $\omega + \omega$, although the 
von Neumann ordinal $\omega + \omega$ does not exist and in fact the only ordinals which can 
be shown to exist in Z are the ones less than $\omega + \omega$. This means that the existence 
theorem (that every well-order is isomorphic to some von Neumann ordinal) fails 
in Z: Although the definition of von Neumann ordinals does not need Replacement, 
the proof of the existence theorem needs Replacement in an essential way. 

Note that uncountable well-orders can be shown to exist in Z. 


Problem 1276. Construct a well-order of type $\omega_1$ in Z. 


Thus even though most countable ordinals (those $\ge \omega + \omega$) are not available 
in Z, we can fix a well-order of type $\omega_1$ and take its proper initial segments to 
serve as representatives for the countable ordinals in Z. Unless we insist on the 
“absolutist” approach of von Neumann ordinals, all countable ordinals exist in Z in 
this “structuralist” sense used in ordinary mathematics. 

More generally, the Hartogs construction can be done in Z: 


Problem 1277. For any set A, there is a well-order X such that every proper initial 
segment of X, but not X itself, is equinumerous to some subset of A. 


By induction, well-orders of type $\omega_n$ (and so sets of cardinality $\aleph_n$) exist for all 
$n \in \omega$. Thus, in spite of the lack of von Neumann ordinals $\ge \omega + \omega$, representative 
well-orders for every ordinal $\alpha < \omega_\omega$ are available in Z. For cardinals, put $\beth_n := 
|P^n(\omega)|$, and note that for any cardinal $\kappa < \beth_n$ ($n \in \omega$), sets of cardinality $\kappa$ exist 
in Z. 




However, this is where the development of the transfinite stops in Z. No well-order 
of type $\omega_\omega$ or set of cardinality $\aleph_\omega$ can be shown to exist in Z. In fact, it can 
be shown that $V_{\omega+\omega}$ works as a “set-theoretic universe” or “model” for Z. 


Other Limitations in Mathematics Without Replacement 


Another problem due to lack of Replacement appears when defining functions on 
$\omega$ (or on $\mathbf{N}$) by recursion, i.e., where one defines a term $F(n)$ for $n \in \omega$. If all the 
values $F(n)$ of the function $F$ belong to a set $B$ which already exists in Z, then 
$F$ becomes a subset of $\omega \times B$, and so by Separation $F$ exists in Z as an ordinary 
function (i.e., a set whose existence can be established in Z). If, however, the values 
of $F$ are not restricted to be within such a predetermined set, then in general $F$ will 
only be a functional, not necessarily a function; and its range will be a class, not 
necessarily a set. For example, the functional $n \mapsto \omega + n$ ($n \in \omega$) cannot be proved 
to be a function in Z. (More generally, this problem affects definition by transfinite 
recursion.) 

Such constructions are common in ordinary mathematics (e.g., forming algebraic 
closures of fields, infinite direct products of groups, etc.) where one defines 
(recursively) sets $A_0, A_1, A_2, \dots$, and in the end needs to combine them somehow, 
e.g., form the union $\bigcup_n A_n$. In general, this cannot be done in Z, unless the sets $A_n$ 
are all subsets of a fixed set already known to exist in Z. For example, if we let 
$A_n = \omega + n$, then, as mentioned before, the union $\bigcup_n A_n = \omega + \omega$ is a class in Z 
which cannot be shown to exist as a set in Z. 

However, in most ordinary cases, this problem can be addressed as follows: 
It is usually possible to find suitable isomorphic copies of the sets $A_n$ within a 
predetermined set known to exist in Z, and work with these copies instead. In the 
structuralist approach of ordinary mathematics, such isomorphic replacements for 
the sets $A_n$ are acceptable. For example, in Z one can find well-ordered subsets $X_n$ 
of $\mathbf{Q}$ where each $X_n$ is an initial segment of $X_{n+1}$ and $X_n$ is order isomorphic to 
$\omega + n$, so that their union $\bigcup_n X_n$ has type $\omega + \omega$. 

There is one scenario in which this approach does not work: When the sets $A_n$ 
grow bigger in cardinality without bounds, there may be no set in Z which has room 
to fit them (or their copies) all in. This is the case, e.g., if we put $A_n = P^n(\omega)$, 
since for any set $E$ which contains copies of $A_n$ we must have $|E| \ge \beth_n$ for all 
$n$, and such an $E$ cannot be guaranteed to exist in Z. But such situations are rare$^4$ 
in ordinary mathematics. For most of ordinary mathematics, one does not need the 
Axiom of Replacement and Zermelo’s system Z turns out to be quite adequate. 


$^4$ There are a few theorems of mathematics, such as the determinacy of Borel games (proved by 
Martin), where such situations are indeed encountered and the use of the Axiom of Replacement 
becomes necessary. 


Definition 1278. For a relation $R$ on $X$, we say that $(X, R)$ has the von Neumann 
property if the set of $R$-predecessors in $X$ of any element $x \in X$ coincides with 
$x$, that is, if for any $x \in X$, we have $(\forall y)(y \in x \iff y \in X \wedge yRx)$. 


Problem 1279. Let R be a relation on X. Then (X,R) has the von Neumann 
property if and only if X is a transitive set and R is the membership relation 
restricted to X. 


Thus the membership relation restricted to a set X has the von Neumann property if 
and only if X is transitive. 


Extensional Well-Founded Relations 


A relation $R$ on a set $X$ is called extensional if distinct elements of $X$ have 
distinct sets of $R$-predecessors, that is, if for all $a, b \in X$, $(\forall x (xRa \iff xRb)) \Rightarrow 
a = b$. (Thus the axiom of extensionality says that the set membership relation 
$\in$ defined on the class of all sets is extensional.) For well-founded extensional 
relations, a generalization of the existence theorem for well-orders (representation 
by von Neumann ordinals) holds. This result, due to Mostowski, states: If $R$ is a 
relation on $X$ which is well-founded and extensional on $X$, then there exists a unique 
transitive set $M$ such that $(X, R)$ is isomorphic to $(M, \in_M)$, where $\in_M$ is the set 
membership relation restricted to $M$. 


Problem 1280. Prove Mostowski’s Theorem as stated above. 


[Hint: Recall the uniqueness-existence proofs for von Neumann well-orders.] 
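For a finite relation, the collapsing map in Mostowski's result can be computed by recursion along $R$, sending each element to the set of images of its $R$-predecessors. The following Python sketch is only an illustration on finite inputs, under the assumption that the relation is given as a set of pairs; the name `mostowski_collapse` and the sample relation are hypothetical.

```python
def mostowski_collapse(X, R):
    """Collapse a finite well-founded relation.

    X is a list of hashable elements and R a set of pairs (a, b) meaning
    'a R b'. Returns a dict pi with pi[b] = { pi[a] : a R b }.
    """
    preds = {b: [a for (a, c) in R if c == b] for b in X}
    pi = {}

    def collapse(b, path=()):
        if b in pi:
            return pi[b]
        if b in path:
            # A cycle means R has no well-founded recursion to follow.
            raise ValueError("relation is not well-founded")
        pi[b] = frozenset(collapse(a, path + (b,)) for a in preds[b])
        return pi[b]

    for b in X:
        collapse(b)
    return pi


# A three-element strict linear order p < q < r (well-founded and extensional).
X = ["p", "q", "r"]
R = {("p", "q"), ("p", "r"), ("q", "r")}
pi = mostowski_collapse(X, R)
M = frozenset(pi.values())
```

On this input the image `M` is the von Neumann ordinal 3, which matches the special case of well-orders mentioned in the hint. Extensionality of $R$ is what makes `pi` injective; the sketch does not check it.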


Transitive Closures 


Given any set $x$, its transitive closure is formed by collecting together the members 
of $x$ along with the members of members of $x$, the members of members of members 
of $x$, and so on. In the language of Chap. 10 Sect. 11.4, the transitive closure of $x$ is 
the “$\in$-ancestry” of $x$. In other words, $y$ is in the transitive closure of $x$ if there is a 
positive integer $n$ and $y_0, y_1, \dots, y_n$ such that $y_0 = y$, $y_n = x$, and $y_k \in y_{k+1}$ for 
$k = 0, 1, \dots, n - 1$. This can be formalized in ZF as follows. 


Definition 1281 (Transitive Closure). 

$$y \in \mathrm{tc}(x) \iff \exists f\, \exists n\, \bigl(n \in \omega \wedge n \ne 0 \wedge f \text{ is a function} \wedge \mathrm{dom}(f) = n^{+} 
\wedge f(0) = y \wedge f(n) = x \wedge \forall k\, (k < n \Rightarrow f(k) \in f(k+1))\bigr).$$ 


We then have 


Proposition 1282. One can deduce the following results using the ZF axioms 
introduced so far: 


1. $\mathrm{tc}(x)$ is transitive and is the smallest transitive set containing $x$ as a subset. 

2. $\mathrm{tc}(\mathrm{tc}(x)) = \mathrm{tc}(x)$. 

3. $\mathrm{tc}(x) \cup \{x\} = \mathrm{tc}(x^{+})$ is transitive and is the least transitive set containing $x$ as 
a member. 

4. $\mathrm{tc}(x) = x \cup (\bigcup x) \cup (\bigcup \bigcup x) \cup \cdots$. 

5. $\mathrm{tc}(x) = x \cup \bigcup_{y \in x} \mathrm{tc}(y)$. 
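For hereditarily finite sets, the description of the transitive closure as $x \cup (\bigcup x) \cup (\bigcup \bigcup x) \cup \cdots$ can be run as an iteration that stops once a layer is empty. The sketch below assumes sets are modelled as nested `frozenset` values; `union`, `tc`, and `is_transitive` are illustrative names.

```python
def union(x):
    """Return the union of the members of x."""
    return frozenset(z for y in x for z in y)


def tc(x):
    """Transitive closure of a hereditarily finite set: x, its union, and so on."""
    result = frozenset()
    layer = x
    while layer:
        result |= layer
        layer = union(layer)
    return result


def is_transitive(s):
    """A set is transitive if every member is also a subset."""
    return all(y <= s for y in s)


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
x = frozenset({two})                 # x = {2}, which is not transitive
closure = tc(x)                      # collects 2, 1 and 0
```

The loop terminates because nested `frozenset` values are necessarily well-founded, so repeated unions eventually produce the empty set.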


Regular Sets 


Recall that we saw that under AC, a relation $R$ is well-founded if and only if there 
is no descending infinite $R$-chain, that is, there is no infinite sequence $\langle x_n \rangle$ such that 
$x_{n+1} R x_n$ for all $n$. In particular, if $R$ is well-founded then $R$ is irreflexive and there 
are no $R$-cycles such as $xRy$ and $yRx$, or $xRy$, $yRz$, and $zRx$. 

In the context of sets, we will say that $x$ contains a descending $\in$-chain if there 
is an infinite sequence $\langle x_n \rangle$ such that 

$$\cdots \in x_{n+1} \in x_n \in \cdots \in x_3 \in x_2 \in x_1 \in x.$$ 

If $x$ contained a descending chain as above, then all the elements $x_n$ would be in 
the transitive closure of $x$, and so $\mathrm{tc}(x)$ would not be well-founded under the 
membership relation $\in$. 


Definition 1283 (Regular Sets). $x$ is regular if $\mathrm{tc}(x)$ is well-founded under the 
membership relation $\in$. 


A regular set is often called a well-founded set. 
Under AC, a set is regular if and only if it does not contain a descending $\in$-chain, 
but even without AC a regular set cannot contain a descending $\in$-chain. 


Problem 1284. Every ordinal is regular. 


In addition to the ordinals, all sets we have encountered so far were regular.$^5$ 
Regularity of a set prevents it from satisfying unusual conditions such as 
self-membership. In particular, if $x$ is regular then the singleton $\{x\}$ will be distinct 
from $x$. Similarly, we cannot have “$\in$-cycles” such as $x \in y \wedge y \in x$, or 
$x \in y \wedge y \in z \wedge z \in x$, if any of the sets in the cycle is regular. 


$^5$ It can be shown that the existence of non-regular sets cannot be demonstrated on the basis of the 
ZF axioms introduced so far, unless those axioms are contradictory. 


Since $\mathrm{tc}(\mathrm{tc}(x)) = \mathrm{tc}(x)$, we see that $x$ is regular if and only if $\mathrm{tc}(x)$ is regular. 
Moreover, we have: 


Problem 1285. For any set x, 


1. x is regular if and only if every member y € x is regular. 
2. If x is regular then any subset of x is regular. 
3. If x is regular then P(x) is regular. 


Problem 1286. If $x \in V_\alpha$ for some ordinal $\alpha$, then $x$ is regular. 


Ranks 


Recall that every well-founded relation $R$ on a set $A$ has a unique rank. If $x$ is 
regular, so that $\mathrm{tc}(x)$ is well-founded under $\in$, we define the rank of $x$ to be the 
rank of the well-founded structure $(\mathrm{tc}(x), \in)$ (we can think of $\mathrm{tc}(x)$ as providing the 
structure by which $x$ is “built up” from $\emptyset$). 


Definition 1287 (Rank of Sets). For any regular $x$, the rank of $x$, denoted by 
$\mathrm{rnk}(x)$, is the rank of the well-founded structure $(\mathrm{tc}(x), \in)$. 


From the results of Chap. 10 Sect. 11.4, we have the following result. 
Problem 1288. If $x$ is regular, then 

$$\mathrm{rnk}(x) = \sup\{\mathrm{rnk}(y) + 1 \mid y \in x\}.$$ 


Corollary 1289. If $x$ is regular and $y \in x$, then $\mathrm{rnk}(y) < \mathrm{rnk}(x)$. 


The following theorem establishes the link between ranks and the levels of the 
cumulative hierarchy. 


Theorem 1290. $x \in V_\alpha$ if and only if $x$ is regular and $\mathrm{rnk}(x) < \alpha$. 


Proof. We use transfinite induction on $\alpha$. Suppose that the result holds for all $\beta < \alpha$. 

First, let $x \in V_\alpha$. Then $x \subseteq V_\beta$ for some $\beta < \alpha$ (by Problem 1270), so $y \in V_\beta$ for all $y \in x$, so by 
induction hypothesis we get: $y$ is regular and $\mathrm{rnk}(y) < \beta$ for all $y \in x$. Since every 
member of $x$ is regular, $x$ is regular, and since $\mathrm{rnk}(y) < \beta$ for all $y \in x$, we get 

$$\mathrm{rnk}(x) = \sup_{y \in x} (\mathrm{rnk}(y) + 1) \le \beta < \alpha.$$ 

Conversely, suppose that $x$ is regular and $\mathrm{rnk}(x) = \beta < \alpha$. Hence for every 
$y \in x$, $y$ is regular and $\mathrm{rnk}(y) < \mathrm{rnk}(x) = \beta$, so by induction hypothesis $y \in V_\beta$ 
for all $y \in x$, which implies $x \subseteq V_\beta$, and so $x \in P(V_\beta) = V_{\beta+1} \subseteq V_\alpha$. $\Box$ 


By the following, the regular sets are exactly the ones that can be obtained through 
the progression of the cumulative hierarchy. 


Corollary 1291. $x$ is regular with $\mathrm{rnk}(x) = \alpha$ if and only if $x \in V_{\alpha+1} \setminus V_\alpha$. 
Hence for regular $x$, $\mathrm{rnk}(x)$ is the least ordinal $\mu$ such that $x \in V_{\mu+1}$. 
Moreover, $x$ is regular if and only if $x \in V_\alpha$ for some ordinal $\alpha$. 
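The link between ranks and levels can be checked mechanically on the first few finite levels. The sketch below is restricted to hereditarily finite sets modelled as nested `frozenset` values and uses the recursive formula of Problem 1288 as the definition of `rnk`; all function and variable names are illustrative assumptions.

```python
from itertools import combinations


def power_set(s):
    """Return the set of all subsets of the frozenset s."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


def rnk(x):
    """rnk(x) = sup{rnk(y) + 1 : y in x}; the sup of the empty set is 0."""
    return max((rnk(y) + 1 for y in x), default=0)


# Build V_0, ..., V_4 with V_{n+1} = P(V_n).
levels = [frozenset()]
for _ in range(4):
    levels.append(power_set(levels[-1]))

# Theorem 1290 on this fragment: x is in V_n exactly when rnk(x) < n.
theorem_holds = all(
    (x in levels[n]) == (rnk(x) < n)
    for x in levels[4]
    for n in range(5)
)

# Corollary 1291 on this fragment: the sets of rank n are those in V_{n+1} but not V_n.
corollary_holds = all(
    frozenset(x for x in levels[4] if rnk(x) == n) == levels[n + 1] - levels[n]
    for n in range(4)
)
```

This is a finite sanity check, not a proof; the transfinite cases are exactly where the induction in the proof above is needed.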


21.8 Foundation and the Set Theoretic Universe V 


As we have mentioned before, non-regular sets (sets with descending $\in$-chains) are 
not only unnecessary for the development of mathematics, but are also not 
naturally encountered, since the existence of non-regular sets cannot be derived from 
the ZF axioms we have mentioned so far. The last axiom of ZF is used to explicitly 
rule out such extraneous sets. 


ZF 9 (Axiom of Foundation). Every set is regular, or equivalently every set 
belongs to $V_\alpha$ for some ordinal $\alpha$. 


Problem 1292. The Axiom of Foundation is equivalent to the statement that every 
nonempty set x has a member which is disjoint from x. 


The following problems need the Axiom of Foundation. 


Problem 1293. $x$ is an ordinal if and only if $x$ is transitive and linearly ordered by 
the membership relation $\in$. 


Problem 1294. A set $x$ is said to hereditarily have a property $P$ if $x$ and all 
members of $\mathrm{tc}(x)$ have property $P$. Show that $x$ is hereditarily transitive if and 
only if $x$ is an ordinal, and $x$ is hereditarily finite if and only if $x \in V_\omega$. 


Recall that $V := \{x \mid x = x\}$ denotes the class of all sets, also called the 
set theoretic universe (we had seen that $V$ cannot be a set). Then the Axiom of 
Foundation can be stated in the following suggestive form: 

$$V = \bigcup_{\alpha \in \mathrm{On}} V_\alpha.$$ 

Since the sets $V_\alpha$ increase with $\alpha$, the above equation shows that the universe $V$ of 
all sets is arranged in a hierarchy of sets of ever increasing ranks. Foundation thus 
enables us to prove that a property is true of all sets using the convenient method of 
transfinite induction on the rank of a set. 

Note that the complement of any set is a class which will contain all sets of 
sufficiently large rank. Thus in ZF set theory each set is a minuscule, infinitesimal 
part of the universe, and the universe $V$ of all sets is truly large compared to any 
particular set. 

The Universe V 


Problem 1295 (ZFC). For each cardinal $\kappa$, put $H(\kappa) := \{x \mid |\mathrm{tc}(x)| < \kappa\}$. 


1. $H(\aleph_0) = V_\omega$ (which is countable). 

2. $H(\aleph_\alpha) \subseteq V_{\omega_\alpha}$, so $H(\kappa)$ is a set. 

3. The set of hereditarily countable sets has cardinality $\mathfrak{c}$: $|H(\aleph_1)| = 2^{\aleph_0}$. 

4. There is a natural surjection from the set $W$ of well-founded trees over $\omega$ onto 
$H(\aleph_1)$ preserving rank (between trees and sets). 


Defining Cardinals Without Choice 


Although $V$ is not a set in ZF, $V_\alpha$ is a set for each ordinal $\alpha$. This is useful in 
effectively extracting a nonempty subset from any large collection which may not be 
a set. In particular, we have the method of Scott, in which the Axiom of Foundation 
is used to define the notion of cardinal number for arbitrary sets without using the 
Axiom of Choice. In the Frege—Russell definition, the cardinal number |a| of a set a 
is the equivalence class of a under the equinumerosity relation. Thus |a|, consisting 
of all sets equinumerous to a, is a problematic large collection which is not a set. 


Problem 1296 (ZF). For any nonempty set a, there is no set which contains all sets 
equinumerous with a. 


Scott’s method allows us to effectively select a nonempty subcollection of the 
equivalence class which is small enough to be a set. 


Definition 1297 (The Frege-Russell-Scott Cardinal). For any set $x$, let $|x|^{\sim}$ be 
the collection of all sets of least rank which are equinumerous to $x$: 

$$|x|^{\sim} := \{y \mid y \sim x \wedge \forall z\, (z \sim x \Rightarrow \mathrm{rnk}(y) \le \mathrm{rnk}(z))\}.$$ 


Problem 1298. For any $x$, $|x|^{\sim} \subseteq V_{\mathrm{rnk}(x)+1}$ and so $|x|^{\sim}$ is a set. 
Moreover, the functional $x \mapsto |x|^{\sim}$ is a complete invariant for equinumerosity 
($|x|^{\sim} = |y|^{\sim} \iff x \sim y$ for all sets $x, y$), and therefore is a satisfactory definition 
of “cardinal number.”$^6$ 


More generally, for any relational $R$, one can define the Frege-Russell-Scott 
invariant for $R$ as follows. 


Definition 1299 (The Frege-Russell-Scott Invariant). If $R$ is a relational and $x$ 
is any set, let $[x]_R$ be the collection of all sets $y$ of least rank for which $yRx$ holds: 

$$[x]_R := \{y \mid yRx \wedge \forall z\, (zRx \Rightarrow \mathrm{rnk}(y) \le \mathrm{rnk}(z))\}.$$ 


Problem 1300. Let $\equiv$ be an equivalence relational on a class $C$. Then $[x]_\equiv$ is a set 
for all $x$, and the Frege-Russell-Scott invariant mapping 

$$x \mapsto [x]_\equiv \quad (x \in C)$$ 

acts as a complete invariant for the equivalence relational $\equiv$, that is, for all $x, y \in 
C$ we have $x \equiv y$ if and only if $[x]_\equiv = [y]_\equiv$. 


The above Frege-Russell-Scott method works for any equivalence relational 
whatsoever, but is of particular relevance when some of the equivalence classes 
are too large to be sets. For example, we can conveniently define the order type 
OrdTyp($X$) of an arbitrary order $X$ using the Frege-Russell-Scott invariant by 
taking OrdTyp($X$) := $[X]_{\cong}$, where $\cong$ denotes similarity (isomorphism) of orders. 


$^6$ One can even use a “hybrid method” to define cardinal numbers without Choice, where 
$|x|$ is defined as the least ordinal equinumerous to $x$ if $x$ can be well-ordered, and as the 
Frege-Russell-Scott cardinal of $x$ otherwise [48]. Under Choice, this definition reduces to the 
Cantor-von Neumann definition. 


21.9 Other Formalizations of Set Theory 


Other than the Zermelo-Fraenkel system, there have been several other major 
axiomatizations of set theory which use the formal framework of first order logic. 
We briefly discuss a few of them. 


Von Neumann-Bernays Set Theory 


This popular axiomatization of set theory was initiated by von Neumann in 1925 
[78, pp. 393-413], and later developed extensively by Bernays (see [3]). We will 
abbreviate the name “Von Neumann-Bernays Set Theory” as VNB. Further work on 
VNB was done by Robinson and Gödel, and VNB is often called the Von 
Neumann-Bernays-Gödel Set Theory (abbreviated as NBG). 

In ZF, we had used the term “class” to informally talk about large collections 
which were not sets, but classes did not exist as formal objects in ZF. A key 
feature of VNB is that it formally allows talking about large collections such as 
the collection V of all sets—which are now allowed as objects that formally exist. 
Since such collections cannot be sets, VNB uses the formal term class for general 
collections, and the system is developed as a theory of classes (subcollections are 
called subclasses, e.g.). In VNB, every object is a class, and so all sets are classes, 
but there are classes which are too large to be sets, such as the class V of all sets 
and the class On of all ordinals, which are called proper classes. A proper class 
cannot be a member of any class. A set is defined as a class which can be a member 
of some class. Thus VNB divides classes into two distinct and exclusive sorts, sets 
and proper classes. 

The notion of class in VNB is a formal way of representing Zermelo’s vague 
notion of “definite property,” and is similar to Skolem’s use of the formal notion 
of a first-order formula in place of a “property.” VNB has a “class comprehension 
scheme” for forming new classes, but the quantifiers in the formulas used in the 
formation of classes cannot range over arbitrary classes—they are limited to range 
over sets. Using classes, the axiom of replacement can be stated simply as follows: 
If F is a class which is a function and A is a set then the image F[A] is a set. 
Note that in ZF, replacement was an axiom scheme—an infinite list of individual 
axioms (instances of the scheme), but in VNB it is a single axiom. This brings us 
to another important aspect of VNB: It is an axiom system which turns out to be 
finitely axiomatizable, that is, it is possible to find a finite list of individual axioms 
(not schemes) which will axiomatize VNB. ZF cannot be axiomatized with a finite 
set of axioms (unless ZF turns out to be inconsistent). 

In spite of its appearance to be a more extensive theory than ZF —allowing a 
larger collection of objects that it can formally talk about—VNB turns out to be 
essentially equivalent to ZF in the following sense: Any statement mentioning only 
sets (no classes) that can be proved in VNB can be proved in ZF, and vice versa. 
In other words, VNB cannot prove any new fact about sets that ZF cannot already 
prove (which is technically stated by saying that VNB is a conservative extension of 
ZF). In particular, this implies that ZF and VNB are equiconsistent theories: if one 
of them is consistent, then so must be the other. 

More on VNB can be found in the references, such as Bernays [3], which has a 
detailed presentation of VNB. 

Another system due to Morse and Kelley, known as Morse—Kelley Set Theory or 
MK, was first introduced in 1955 as an appendix to Kelley’s book on topology [40]. 
MK is closely related to VNB, but it allows quantification over arbitrary classes in 
its class comprehension scheme, resulting in a system which is strictly stronger than 
ZF and VNB (assuming ZF is consistent). 


Quine’s New Foundations 


From the systems of set theory discussed so far (Type-Theory/PM, ZF, VNB, MK), 
one may think that in order to avoid contradictions, a formal system should never 
allow the collection of all sets to be a set itself—an idea expressed by Russell 
as the vicious circle principle and famously paraphrased by Halmos as “nothing 
contains everything.” Yet, in 1937, Quine [61] introduced an axiomatization of set 
theory called New Foundations, or NF, which has a set containing all sets (and thus 
containing itself)! In this and several other respects, NF is a formal system strikingly 
different from ZF and VNB. 

NF allows the set of all sets (or the class of all classes): The entire universe 
$V := \{x \mid x = x\}$ is itself a set, and we have $V \in V$ in NF. Every set $A$ has a global 
complement $V \setminus A := \{x \mid x \notin A\}$, and so with $V$ as the universal set, the collection 
$P(V)$ of all subsets of $V$ becomes a Boolean algebra of sets. 

Like Frege’s original system, NF has only two axioms: Extensionality and 
Comprehension. The language of NF is identical to that of ZF: any NF-formula 
is a ZF-formula and vice versa. The axiom of extensionality says, as usual, 

$$x = y \iff \forall z\, (z \in x \iff z \in y).$$ 

Instead of building sets from bottom up via mechanisms such as the power set 
operation, NF forms sets only via applications of the comprehension scheme, just 
as in Frege’s original (inconsistent) system. For example, for any set $A$, its power 
set is formed as $P(A) := \{x \mid \forall y\, (y \in x \Rightarrow y \in A)\}$, which exists by 
NF-comprehension; no special power set axiom is needed. 

However, to avoid contradictions, NF puts a restriction on the syntax of the 
formulas that can be used in its comprehension scheme: In order to form the set 
$\{x \mid \varphi(x)\}$ via comprehension, the formula $\varphi = \varphi(x)$ must be “stratified.” An 
NF-formula $\varphi$ is said to be stratified if there is a mapping $f$ from the instances of 
variable letters occurring in $\varphi$ to $\mathbf{N}$ such that for any substring of $\varphi$ having the form 
“$x \in y$” we have $f(y) = f(x) + 1$ and for any substring of $\varphi$ having the form 
“$x = y$” we have $f(x) = f(y)$. This avoids forming sets such as Russell’s set 
$\{x \mid \neg(x \in x)\}$, since the formula “$x \in x$” is not stratified. 
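Checking stratification amounts to solving a small system of integer constraints. The Python sketch below makes a simplifying assumption: a formula is represented only by the list of its atomic subformulas, and each variable name gets a single level (so bound variables are taken to be renamed apart). The function `stratify` and the tuple encoding are hypothetical, not part of NF's official syntax.

```python
from collections import deque


def stratify(atoms):
    """Try to find a stratification for a list of atomic subformulas.

    Each atom is ("in", x, y) for 'x in y' or ("eq", x, y) for 'x = y'.
    Returns a dict mapping variables to natural numbers with
    f(y) = f(x) + 1 for membership atoms and f(x) = f(y) for equality
    atoms, or None if no such assignment exists.
    """
    edges = {}
    for kind, x, y in atoms:
        step = 1 if kind == "in" else 0
        edges.setdefault(x, []).append((y, step))
        edges.setdefault(y, []).append((x, -step))

    level = {}
    for start in edges:
        if start in level:
            continue
        level[start] = 0
        component = [start]
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, step in edges[u]:
                if v not in level:
                    level[v] = level[u] + step
                    component.append(v)
                    queue.append(v)
                elif level[v] != level[u] + step:
                    return None          # conflicting constraints
        # Shift the component so that all its levels are natural numbers.
        low = min(level[v] for v in component)
        for v in component:
            level[v] -= low
    return level


russell = stratify([("in", "x", "x")])                      # 'x in x'
power = stratify([("in", "y", "x"), ("in", "y", "A")])      # atoms of the power set formula
```

Here `russell` is `None`, reflecting that “$x \in x$” is not stratified, while the atoms of the power set formula receive consistent levels with `x` and `A` one level above `y`.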

In NF, cardinal and ordinal numbers are defined using the original Frege—Russell 
global invariant: The cardinal number of a set A is the set of all sets equinumerous 
to A, and the order-type number of an order X is the set of all orders isomorphic 
to X. Unlike ZF, there is no problem in NF of these notions being “too large to be 
sets.” 

On the other hand, NF sharply deviates from classical “Cantorian” set theory in 
several ways. For example, Cantor’s Theorem |X| < | P(X)| cannot be proved in 
NF in full generality, and NF refutes the axiom of choice. Some of NF’s oddities 
are addressed in a newer theory due to Jensen known as NFU, which allows non-set 
atoms (individuals). 

For more on NF and NFU, see Quine [61], Fraenkel, Bar-Hillel, and Levy [21], 
Forster [19], and Holmes [31]. 


21.10 Further Reading 


Part IV was a rather brief and sketchy introduction to axiomatic set theory, so the 
reader is encouraged to consult the excellent works given below. 

Alternative treatments of some or all of the topics covered in the first parts 
of our book, following a naive informal approach (and with little or no coverage 
of axiomatic systems at all), can be found in the following older classic works: 
Hausdorff [29], Kamke [36], Sierpinski [73], Russell [68], Fraenkel [20], and 
Kuratowski and Mostowski [46]. 

For more coverage on the formal systems of ZF or VNB, see Stoll [76], 
Suppes [77], Bernays [3], Devlin [13], Levy [48], Hrbacek and Jech [32], Fraenkel, 
Bar-Hillel, and Levy [21], Rotman and Kneebone [65], Halmos [27], Enderton [15], 
Vaught [79], Moschovakis [54], Kunen [43], Just and Weese [35], Bourbaki [5], 
Hajnal and Hamburger [26], Hamilton [28], Schimmerling [70], and Goldrei [24]. 

The original work of Cantor and Dedekind can be found in Cantor [6] and 
Dedekind [12]. An excellent collection of primary sources on the historical 
development of axiomatic set theory and logic is van Heijenoort’s From Frege to 
Gödel: A Source Book in Mathematical Logic, 1879-1931 [78], where one can find 
English translations of Zermelo’s and von Neumann’s original papers introducing 
their axiomatic set theories, as well as subsequent enhancements of their systems 
by Skolem, Fraenkel, and Bernays. Zermelo’s 1908 paper [85] still serves as an 
excellent introduction to his system. 

For Part III, the introductory topics on the basic topology of R can be found in any 
standard real analysis text. An excellent review of the analogies between Lebesgue 
measure and Baire category is Oxtoby [58]. The topic of Borel and Analytic sets 
belongs to the area of Descriptive Set Theory, for which two standard modern 
references are Kechris [38] and Moschovakis [55]. Some of these topics are also 
covered in the older texts of Hausdorff [29], Sierpinski [74], Kuratowski [45], and 
Kuratowski and Mostowski [46]. See also Rogers [64]. 

More advanced treatments of set theory covering topics such as Gödel’s 
constructible universe L and Cohen’s 1963 technique of forcing for obtaining 
independence proofs (which revolutionized modern set theory and has been in constant 
use since then) require some background in mathematical logic. A basic early text 
on this topic is Cohen [8], while two highly standard references with expositions 
of constructibility and forcing are Kunen [42] and Jech [34]. Bell [2] focuses on 
Boolean-valued models. In 2011, a new rewritten version [44] of Kunen’s 1980 
book was published. 

For the topic of large cardinals, Kanamori [37] is a current standard reference 
(more forthcoming volumes expected), but the encyclopaedic Jech [34] and the older 
Drake [14] are also helpful. 

A volume containing many interesting articles by set theorists is Link [49]. A 
recent handbook containing highly advanced technical surveys of current research 
in set theory is [18]. 


Chapter 22 
Postscript IV: Landmarks of Modern Set Theory 


Abstract This part contains brief informal discussions (with proofs and most 
details omitted) of some of the landmark results of set theory of the past 75 
years. Topics discussed are constructibility, forcing and independence results, large 
cardinal axioms, infinite games and determinacy, projective determinacy, and the 
status of the Continuum Hypothesis. 


Note: Many of the topics discussed below are metamathematical in nature and 
so their precise and rigorous definitions depend on mathematical logic, which is 
beyond the scope of this text. Therefore the descriptions below are necessarily 
sketchy and incomplete. Most of the details can be found in Jech [34], Kunen [42, 
44], and Kanamori [37]. 


22.1 Gödel’s Axiom of Constructibility 


All efforts to settle the Continuum Hypothesis by late nineteenth and early twentieth 
century mathematicians failed. Then, in the late 1930s, Gödel made a major 
breakthrough by introducing the notion of constructible sets and the axiom of 
constructibility. We now briefly describe Gödel’s results [22]. 


Relativization 


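Let $C$ be a given class (which can be a set or a proper class). Then for each 
ZF-formula $\varphi = \varphi(x_1, x_2, \dots, x_n)$, the relativization of $\varphi$ to $C$, denoted by $\varphi^C = 
\varphi^C(x_1, x_2, \dots, x_n)$, is obtained by restricting all the quantifiers in $\varphi$ to range over 
$C$. For example, if $\sigma$ is ‘$\forall x \exists y\, (x \in y)$’ then $\sigma^C$ is ‘$\forall x \in C\ \exists y \in C\ (x \in y)$’. This 
can be formally defined by recursion on formulas by taking $\varphi^C = \varphi$ if $\varphi$ is an atomic 
formula, $(\varphi \wedge \psi)^C = \varphi^C \wedge \psi^C$, $(\neg\varphi)^C = \neg(\varphi^C)$, and $(\exists v\, \varphi)^C = \exists v\, (v \in C \wedge \varphi^C)$. 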
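The recursion on formulas can be written out as a short program. The sketch below assumes a small, hypothetical tuple encoding of formulas built from membership and equality atoms, conjunction, negation, and the existential quantifier (with the universal quantifier treated as an abbreviation for “not exists not”); the name `relativize` is illustrative.

```python
def relativize(phi, C):
    """Relativize the formula phi to the class named C.

    Formulas are tuples:
      ("in", x, y), ("eq", x, y)      atomic formulas
      ("and", phi, psi)               conjunction
      ("not", phi)                    negation
      ("exists", v, phi)              existential quantifier
    """
    tag = phi[0]
    if tag in ("in", "eq"):
        return phi                                   # atomic: unchanged
    if tag == "and":
        return ("and", relativize(phi[1], C), relativize(phi[2], C))
    if tag == "not":
        return ("not", relativize(phi[1], C))
    if tag == "exists":
        v, body = phi[1], phi[2]
        # Bound the quantified variable by membership in C.
        return ("exists", v, ("and", ("in", v, C), relativize(body, C)))
    raise ValueError(f"unknown formula tag: {tag!r}")


# sigma is 'for all x there exists y (x in y)', written as not-exists-not.
sigma = ("not", ("exists", "x", ("not", ("exists", "y", ("in", "x", "y")))))
sigma_C = relativize(sigma, "C")
```

Unfolding the abbreviation, `sigma_C` says that there is no `x` in `C` for which no `y` in `C` satisfies `x in y`, which agrees with the bounded-quantifier form given in the example above.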


A. Dasgupta, Set Theory: With an Introduction to Real Point Sets, 399 
DOI 10.1007/978-1-4614-8854-5__22, © Springer Science+Business Media New York 2014 




Definition 1301. Let $A$ be a set. We say that a subset $E \subseteq A$ is definable from
parameters in $A$ if there exist a ZF-formula $\varphi(x, x_1, x_2, \dots, x_n)$ and elements
$a_1, a_2, \dots, a_n \in A$ such that


$E = \{x \in A \mid \varphi^A(x, a_1, a_2, \dots, a_n)\}$.


Note: This notion can actually be defined formally in ZF (Tarski–Gödel).


Definition 1302. Def(A) denotes the collection of all subsets of A which are 
definable from parameters in A. 


It is clear that $\mathrm{Def}(A) \subseteq \mathcal{P}(A)$, and $\emptyset$, $A$ itself, and all finite subsets of $A$ are
members of $\mathrm{Def}(A)$. Thus if $A$ is finite then $\mathrm{Def}(A) = \mathcal{P}(A)$. On the other hand, if $A$
is infinite then $|\mathrm{Def}(A)| = |A|$ but $|\mathcal{P}(A)| > |A|$, so $\mathrm{Def}(A)$ is a relatively small part
of $\mathcal{P}(A)$ when $A$ is infinite. In addition, if $A$ is transitive, then $A \subseteq \mathrm{Def}(A) \subseteq \mathcal{P}(A)$,
and so $\mathrm{Def}(A)$ is itself transitive.

We now define the hierarchy of constructible sets by transfinite recursion. 


Definition 1303 (The Constructible Hierarchy). Define


$L_0 := \emptyset, \qquad L_{\alpha+1} := \mathrm{Def}(L_\alpha), \qquad L_\alpha := \bigcup_{\beta<\alpha} L_\beta$ if $\alpha$ is limit.


A set $A$ is said to be constructible if $A \in L_\alpha$ for some ordinal $\alpha$. The class of all
constructible sets is denoted by $L$.


The axiom of constructibility says that every set is constructible. Since the class of
all sets is denoted by V and the class of constructible sets by L, the axiom of
constructibility can be stated as “V=L.”

Note that each $L_\alpha$ is transitive, and $L_\alpha \subseteq V_\alpha$ for all $\alpha$. Also, since $\mathrm{Def}(A) = \mathcal{P}(A)$
whenever $A$ is a finite set, we have $L_n = V_n$ for $n < \omega$. It follows that $L_\omega = V_\omega$.
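
Since Def coincides with the power set operation on finite sets, the finite levels $L_n = V_n$ can be computed by iterating the power set. The following Python sketch does this for small $n$ and checks transitivity; the function names power_set, V, and is_transitive are our own choices, and hereditarily finite sets are modeled as nested frozensets.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def V(n):
    """V_n for finite n: apply the power set operation n times to the empty set."""
    level = frozenset()
    for _ in range(n):
        level = power_set(level)
    return level

def is_transitive(s):
    """A set is transitive if each of its elements is also a subset of it."""
    return all(x <= s for x in s)

sizes = [len(V(n)) for n in range(5)]
print(sizes)                                        # sizes of V_0, ..., V_4
print(all(is_transitive(V(n)) for n in range(5)))   # each level is transitive
```

The sizes 0, 1, 2, 4, 16 already show how quickly the levels grow; the next level $V_5$ has $2^{16}$ elements.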

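However, $L_{\omega+1}$ is much smaller than $V_{\omega+1}$, since $L_{\omega+1} = \mathrm{Def}(L_\omega)$ is countable
while $|V_{\omega+1}| = |\mathcal{P}(V_\omega)| = 2^{\aleph_0}$. Next, $L_{\omega+2}$ is still countable but $|V_{\omega+2}| = 2^{2^{\aleph_0}}$. In
fact, $L_\alpha$ stays countable for all $\alpha < \omega_1$, while the sets $V_\alpha$ grow enormously in size!
On the other hand, for every ordinal $\alpha$ we have $\alpha \in L_{\alpha+1} \setminus L_\alpha$ and $\alpha \in V_{\alpha+1} \setminus V_\alpha$, so
$L_\alpha$ and $V_\alpha$ have the same rank or “height,” i.e., $L_\alpha$ is as “tall” as $V_\alpha$. This gives the
following picture.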

Gödel proved two major facts about constructibility:


1. Both the Axiom of Choice (AC) and the Generalized Continuum Hypothesis
(GCH) can be derived from ZF augmented with the axiom of constructibility,
i.e.,


$\mathrm{ZF} \vdash (V = L) \rightarrow \mathrm{AC} + \mathrm{GCH}$,


where we write “$\mathrm{ZF} \vdash \sigma$” to mean that $\sigma$ can be formally derived in ZF.
2. The axiom of constructibility is relatively consistent with ZF, i.e., if ZF is
consistent then so is ZF + V=L.


As an immediate consequence, we have the following: 


Theorem 1304 (Gödel). If ZF is consistent then so is ZF + AC + GCH.
In particular, the Continuum Hypothesis cannot be disproved from ZFC, unless 
ZF itself is inconsistent. 


We now sketch how Gödel’s results can be derived.

First note that if a transitive set A can be well-ordered, then we can also 
well-order Def(A) effectively from the well-order on A. This is because the ZF- 
formulas, countable in number, can be effectively enumerated and the set A* of 
finite sequences of parameters from A can also be effectively well-ordered (from 
the well-order on A). This way, all the sets $L_\alpha$ can be well-ordered in a uniform and
effective fashion such that if $\alpha < \beta$ then $L_\alpha$ is an initial segment of $L_\beta$, which gives
an effective global well-order on all of L. Hence if V=L, then the class of all sets gets 
equipped with a global well-order and the axiom of choice immediately follows. 

To get an idea of how V=L implies the Continuum Hypothesis, note that we have
$|L_{\omega_1}| = \aleph_1$, effectively using the nice well-order of L described above. Gödel also
proved that $\mathcal{P}(\omega) \cap L \subseteq L_{\omega_1}$ (we will not prove this fact here). Hence, if V=L, then
$\mathcal{P}(\omega) \subseteq L_{\omega_1}$, so $|\mathcal{P}(\omega)| \le \aleph_1$, and CH follows.

GCH is derived from V=L in an entirely similar fashion.

Finally, to prove the second part (relative consistency of V=L with ZF), Gödel
first established that for every ZF-sentence $\sigma$:


If $\mathrm{ZF} \vdash \sigma$ then $\mathrm{ZF} \vdash \sigma^L$.

This is expressed by saying that “L is a model of ZF.” Gödel then showed:

$\mathrm{ZF} \vdash (V = L)^L$,


which is expressed by saying “L is a model of V=L.” From these two facts, it easily
follows that if ZF is consistent then so is ZF+V=L, or equivalently that $V \neq L$ cannot
be derived from ZF unless ZF itself is inconsistent. To see this, suppose that
$\mathrm{ZF} \vdash \neg(V = L)$. Then, since L is a model of ZF, we would get $\mathrm{ZF} \vdash \neg(V = L)^L$. But we saw
above that $\mathrm{ZF} \vdash (V = L)^L$, so ZF is inconsistent.

For the details that our proof-sketch has left out, see [34, 42], or [44].


The axiom of constructibility (V=L) is a very strong axiom which settles many
unsolved problems of set theory. For example, V=L implies the diamond principle $\Diamond$, and so the Suslin
Hypothesis is false under V=L.

The axiom of constructibility also decides Lusin’s important unsolved questions 
about regularity properties of the projective sets, but the answers are rather negative 
(for a class of effectively defined sets). Essentially, the highly effective well-ordering 
present under V=L can be used to produce “pathological” Bernstein sets in the low 
levels of the projective hierarchy, giving: 


Theorem 1305 (Gödel). If V=L, then there exist PCA ($\boldsymbol{\Sigma}^1_2$) sets which are not
measurable and do not have the Baire property, and there exist uncountable
coanalytic ($\boldsymbol{\Pi}^1_1$) sets which do not have any perfect subset.


22.2 Cohen’s Method of Forcing 


After Gödel proved his relative consistency result, the problem of independence of
CH remained open¹ until 1963, when Paul Cohen showed that ZFC cannot prove
CH either (assuming ZF is consistent). The Gödel–Cohen results are known as
the independence of the Continuum Hypothesis. Cohen’s proof introduced a new 
method called forcing, which immediately flourished as an extremely powerful 
and versatile technique for obtaining general independence results. Since then, the 
method of forcing has been extended in many ways, and a vast body of independence 
results have been obtained. Forcing remains the most fruitful tool for building 
models of set theory. 

Forcing is best understood in the context of models of set theory, where it is 
viewed as a method for extending a given model. This is beyond the scope of this 
text, and we will only give a bare bones sketch of forcing in purely syntactic terms 
with most of the details left out. Modern expositions of forcing are [2, 34, 42], and
[44]. Cohen’s original text is [8].

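A forcing poset $(P, \le, 1)$ is a partial order with a largest element 1. Given such
a poset, define the (ranked) class $V^P$ of P-names as $V^P := \bigcup_{\alpha \in \mathrm{On}} V^P_\alpha$, where


$V^P_0 := \emptyset, \qquad V^P_{\alpha+1} := \mathcal{P}(V^P_\alpha \times P), \qquad V^P_\alpha := \bigcup_{\xi<\alpha} V^P_\xi$ for limit $\alpha$.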


A P-sentence is a ZF-formula in which all free variables have been replaced by 
P-names. In particular, each ZF-sentence is a P-sentence. One can then define, for 
each P-sentence $\sigma$, the forcing relation (read “$p$ forces $\sigma$”):


¹ Gödel’s method of showing relative consistency, known as the method of inner models, cannot be
used to show the relative consistency of the negation of CH (or of the negation of any statement 
provable from V=L). The reason is that the only inner model of L containing the ordinals is L itself. 




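$p \Vdash_P \sigma$ ($p \in P$),


first for atomic P-sentences by a suitable recursion on the P-names $\mu$, $\nu$ as:


• $p \Vdash_P \mu = \nu \iff \forall \rho \in \mathrm{dom}(\mu) \cup \mathrm{dom}(\nu)\ \forall q \le p\ (q \Vdash_P \rho \in \mu \leftrightarrow q \Vdash_P \rho \in \nu)$,
• $p \Vdash_P \mu \in \nu \iff \forall q \le p\ \exists r \le q\ \exists (\rho, s) \in \nu\ (r \le s \wedge r \Vdash_P \mu = \rho)$,

and then for more complex P-sentences by:


• $p \Vdash_P \sigma \wedge \tau \iff p \Vdash_P \sigma \wedge p \Vdash_P \tau$,
• $p \Vdash_P \neg\sigma \iff \neg\exists q \le p\ (q \Vdash_P \sigma)$,
• $p \Vdash_P \exists x\,\varphi(x) \iff \forall q \le p\ \exists r \le q\ \exists \nu \in V^P\ (r \Vdash_P \varphi(\nu))$.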


Let us note that this really is a definition scheme: an infinite list of definitions, one
for each P-sentence $\sigma$.

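We can see that $p \Vdash_P \neg\sigma \Rightarrow \neg(p \Vdash_P \sigma)$, and that $q \le p$ and $p \Vdash_P \sigma \Rightarrow q \Vdash_P \sigma$,
but it takes a lot of work to establish the following theorem-scheme of ZFC:


(*) If $\mathrm{ZFC} \vdash \sigma$, then $\mathrm{ZFC} \vdash 1 \Vdash_P \sigma$.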


The P-names are thought of as “labels” for a “virtual extension” of the universe
V. There are a lot of duplications in the P-names, but one can identify duplicate
P-names using the equivalence $\mu \sim_P \nu \iff 1 \Vdash_P \mu = \nu$ (for $\mu, \nu \in V^P$). For
each $x \in V$, one can define a canonical P-name $\check{x} \in V^P$ for $x$ by the recursion
$\check{x} := \{(\check{y}, 1) \mid y \in x\}$. Then it can be verified that the mapping $x \mapsto [\check{x}]_{\sim_P}$ “embeds”
the universe V into the “virtual extension” $V^P/{\sim_P}$. Here it is convenient to think of
the set $\{p \in P \mid p \Vdash_P \sigma\}$ as the “generalized truth value” for the P-sentence $\sigma$, so
that by (*) above, the generalized truth value of $\sigma$ equals $P$ (“true”) if $\mathrm{ZFC} \vdash \sigma$,
and equals $\emptyset$ (“false”) if $\mathrm{ZFC} \vdash \neg\sigma$.
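
For hereditarily finite sets, the recursion defining the canonical names can be carried out literally. The Python sketch below is only an illustration: the names ONE, check, and name_rank are hypothetical, sets are modeled as nested frozensets, and the largest element 1 of the poset is represented by a placeholder string.

```python
ONE = "1"   # placeholder for the largest element of the forcing poset

def check(x):
    """Canonical name of a hereditarily finite set x: the set of pairs (check(y), 1) for y in x."""
    return frozenset((check(y), ONE) for y in x)

def name_rank(name):
    """Least n such that `name` appears at level n + 1 of the hierarchy of names."""
    if not name:
        return 0
    return 1 + max(name_rank(sigma) for sigma, _condition in name)

empty = frozenset()
one = frozenset({empty})          # the ordinal 1 = {0}
two = frozenset({empty, one})     # the ordinal 2 = {0, 1}

print(check(empty) == frozenset())    # the name of the empty set is empty
print(len(check(two)))                # one pair for each element of 2
print(name_rank(check(two)))
```

Every pair in such a name carries the condition 1, reflecting that the name is meant to denote the same set $x$ regardless of which conditions are considered.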

Let $\mathrm{Fn}(I, J)$ denote the poset consisting of all functions with domain a finite
subset of $I$ and range contained in $J$, ordered by reverse inclusion so that
$f \le g \iff f \supseteq g$ and $\emptyset$ is the greatest element. Let $P := \mathrm{Fn}(\omega \times \omega_2, \{0, 1\})$. Cohen
showed that, for this poset P,


$1 \Vdash_P 2^{\aleph_0} > \aleph_1$, i.e., $1 \Vdash_P \neg\mathrm{CH}$.


In other words, the generalized truth value of CH is “false” ($\emptyset$) under the poset
$(\mathrm{Fn}(\omega \times \omega_2, 2), \supseteq, \emptyset)$. It follows that CH is not a theorem of ZFC, since otherwise
we would have both $1 \Vdash_P \mathrm{CH}$ and $\neg(1 \Vdash_P \mathrm{CH})$ (since $1 \Vdash_P \neg\mathrm{CH}$ gives
$\neg(1 \Vdash_P \mathrm{CH})$), implying that ZFC is inconsistent. This gives:


Theorem 1306 (Cohen). If ZFC is consistent then so is ZFC + $\neg$CH.
In particular, the Continuum Hypothesis cannot be proved from ZFC, unless ZFC 
itself (and so ZF as well) is inconsistent. 
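
The ordering of the poset Fn(I, J) used above is easy to experiment with. In the Python sketch below, a condition is a dictionary representing a finite partial function; the function names leq, compatible, and common_extension are our own, and the keys are pairs of small integers standing in for elements of the index set (actual ordinals are of course not representable this way).

```python
def leq(f, g):
    """f ≤ g in Fn(I, J): f extends g, i.e., f ⊇ g as sets of pairs."""
    return all(key in f and f[key] == value for key, value in g.items())

def compatible(f, g):
    """Two conditions are compatible if they agree wherever both are defined."""
    return all(f[key] == g[key] for key in f.keys() & g.keys())

def common_extension(f, g):
    """The union of two compatible conditions, which lies below both of them."""
    if not compatible(f, g):
        raise ValueError("conditions are incompatible")
    return {**f, **g}

ONE = {}                            # the empty function is the greatest element
p = {(0, 0): 1}
q = {(0, 0): 1, (3, 1): 0}          # q extends p, so q ≤ p
r = {(0, 0): 0}                     # r disagrees with p at (0, 0)

print(leq(q, p), leq(p, q), leq(p, ONE))
print(compatible(p, q), compatible(p, r))
```

Stronger conditions carry more information, which is why the order is reverse inclusion: a condition lies below every condition it extends.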


The combinatorial properties of a forcing poset (P, <, 1) are crucial in determining 
which statements will be forced, and a great variety of independence results have 
been obtained using various kinds of forcing posets. Forcing can also be used to
show the independence of AC, and the relative consistency of “$2^{\aleph_0} = \aleph_\alpha$” for any
$\alpha$ so long as $\aleph_\alpha$ has uncountable cofinality.

The forcing method has been extended considerably, and these extensions yielded, e.g., the
relative consistency of MA+not-CH. This shows that Suslin’s Hypothesis
(SH) is relatively consistent with ZFC. Combined with the fact that SH is false under 
V=L, we see that SH is independent of the ZFC axioms. 

In Postscript II, we mentioned that MA+not-CH implies that all $\boldsymbol{\Sigma}^1_2$ (PCA) sets
are measurable and have the Baire property. Combined with Gödel’s Theorem 1305,
it follows that the measurability (and Baire property) of $\boldsymbol{\Sigma}^1_2$ sets is independent of
ZFC. This shows that Lusin’s conviction that these problems are unsolvable was 
correct, and that Lusin and the mathematicians of his time had reached the limits of 
what could be proved about projective sets using the usual axioms of set theory. See 
also Theorem 1308 below. 

We conclude our brief discussion of forcing by stating a landmark result obtained 
by Solovay using the method of forcing. Let I denote the assertion that there is an 
inaccessible cardinal. 


Theorem 1307 (Solovay). If ZFC+I is consistent, then so is ZF+DC together with
all of the following assertions: 


1. All subsets of R have the perfect set property. 
2. All subsets of R are Lebesgue measurable. 
3. All subsets of R have the Baire property. 


Solovay also proved the following consistency result about the projective sets: 


Theorem 1308 (Solovay). If ZFC+I is consistent, then so is ZFC+GCH together
with all of the following assertions: 


1. All projective sets have the perfect set property. 
2. All projective sets are Lebesgue measurable. 
3. All projective sets have the Baire property. 


Theorems 1307 and 1308 raised the question whether the assumption of the 
existence of an inaccessible cardinal in the hypothesis of the theorems was really 
necessary. It was already known that conclusion (1) does need the assumption 
of an inaccessible (since the perfect set property for coanalytic sets implies the 
relative consistency of inaccessible cardinals with ZFC, see Theorem 1310 below). 
Surprisingly, Shelah later proved that in both theorems (2) needs the inaccessible 
assumption, but (3) does not! 


22.3 Gödel’s Program and New Axioms


Extending Cohen’s results, Solovay showed that any assertion of the form $2^{\aleph_0} = \aleph_\alpha$
is consistent relative to ZFC, so long as $\alpha$ is a successor ordinal or has uncountable
cofinality. Thus any one of these statements can be taken as an additional axiom to
get a formal extension of set theory. This suggests the consideration of two possible
approaches to the Continuum Problem. 

The first view, sometimes called pluralism or formalism, is that there is no 
pre-existing intrinsic reason to prefer any of these assertions over another. Cohen 
himself expressed support for this view. Pluralism is applicable not only to the 
Continuum Problem, but also to any of the many problems known to be independent 
of the ZFC axioms. Formalists may regard the study of the multitude of possible 
axiomatic set theories as new human constructions or inventions that had never 
existed before. 

The other view, supported by Gödel himself, is that the ideal universe of sets
exists in a reality which is independent of axioms. Gödel believed that “in this
reality, Cantor’s conjecture must be either true or false, and its undecidability
from the axioms as known today can only mean that these axioms do not contain
a complete description of this reality” [23]. This view is thus sometimes called
platonism or realism. Gödel suggested a search for the discovery of new natural
axioms of set theory² which will be powerful enough to determine the “correct truth
value” of problems currently known to be independent of ZFC. This is known as
Gödel’s Program.³


22.4 Large Cardinal Axioms 


One possible candidate for a new axiom could be the axiom of constructibility 
(V=L). We saw that V=L is powerful enough to settle many of the major undecidable 
problems of set theory such as CH and SH, and, as Jensen points out, is a form of 
Occam’s razor since it denies the existence of any set other than the constructible 
ones. It gives a very “narrow” universe of sets. 

Very different from the axiom of constructibility are large cardinal axioms or 
axioms of strong infinity. Existence of a large cardinal implies the consistency of 
ZF. For example, let I denote the assertion that there is an inaccessible cardinal. 
Then it can be shown that 


$\mathrm{ZFC} + \mathrm{I} \vdash \mathrm{Con}(\mathrm{ZFC})$,


where “Con(ZFC)” stands for “ZFC is consistent.” By a result known as Gödel’s
second incompleteness theorem on unprovability of consistency, existence of such 
cardinals (or even the relative consistency of their existence) cannot be proved in 
ZFC. 


² Similar to the search for discovering true principles in physics.


³ Set theorists differ widely on these matters, and pluralists and believers in Gödel’s program
represent only two of many possible viewpoints. Feferman has expressed the view that the Continuum
Hypothesis is not even a definite mathematical problem. See [16] for a panoramic debate, [50]
for some background, and [51] for more references. See also the EFI project web site
http://logic.harvard.edu/efi.php.

The smallest large cardinals are the inaccessible cardinals, but we have also met
two other types in the earlier postscripts, namely the weakly compact cardinals
encountered in infinitary combinatorics, and the measurable cardinals that arose
in Ulam’s work on extensions of Lebesgue measure. One can use Gödel’s second
incompleteness theorem again to distinguish between “strengths” of large cardinal
axioms. For example, let M denote “there is a measurable cardinal” and W denote
“there is a weakly compact cardinal.” It can be shown that $\mathrm{ZFC}+\mathrm{M} \vdash \mathrm{Con}(\mathrm{ZFC}+\mathrm{W})$
and $\mathrm{ZFC}+\mathrm{W} \vdash \mathrm{Con}(\mathrm{ZFC}+\mathrm{I})$. Hence (the existence of) a measurable cardinal is
strictly stronger in consistency strength than (the existence of) a weakly compact 
cardinal, which in turn is strictly stronger than (the existence of) an inaccessible. 

Now let PP denote “every projective set has the perfect set property.” Then from 
Solovay’s results it follows that Con(ZFC+I) is equivalent to Con(ZFC+PP), and so 
the perfect set property for projective sets is equiconsistent, relative to ZFC, with 
(the existence of) an inaccessible. 

Most set theorists, starting from the inventor Gödel himself, find the axiom of
constructibility to be highly unacceptable as an axiom. Gödel’s results showed
that the axiom of constructibility does answer Lusin’s question about regularity
properties of $\boldsymbol{\Sigma}^1_2$ (PCA) sets, but in a “negative” way: If V=L then there are $\boldsymbol{\Sigma}^1_2$
sets which are not Lebesgue measurable, there are uncountable $\boldsymbol{\Pi}^1_1$ (coanalytic) sets
without perfect subsets, etc. More generally, most set theorists find the restriction on
set existence placed by the axiom of constructibility too severe to be acceptable.

On the other hand, large cardinal axioms in general have been far more attractive 
to set theorists. They often resolve problems of ordinary mathematics in more 
“pleasant” ways. For example, recall Solovay’s result: 


Theorem 1309 (Solovay). If there is a measurable cardinal, then all $\boldsymbol{\Sigma}^1_2$ sets have
the perfect set property, are measurable, and have the Baire property. 


Thus constructibility and large cardinals seem to be naturally opposed to each 
other:⁴ In the low levels of the projective hierarchy, the former implies some
pathological phenomena, while the latter is intimately connected with regularity 
properties. In fact, we have the following partial reversal: 


Theorem 1310 (Solovay). If all uncountable $\boldsymbol{\Pi}^1_1$ (coanalytic) sets have perfect
subsets, then at most countably many real numbers are constructible and the 
existence of inaccessible cardinals is relatively consistent with ZFC. 


This indicates that large cardinals beyond ZFC are necessary for establishing the 
perfect set property for the higher projective classes, further vindicating Lusin’s 
conviction that the regularity properties enjoyed by the analytic sets would be 
impossible to extend to the higher projective classes (using the usual axioms of set 


⁴ An earlier result of Scott had shown that the axiom of constructibility contradicts the existence
of measurable cardinals. Gaifman, Rowbottom and Silver dramatically improved Scott’s result to 
show that if a measurable cardinal exists then in a certain sense the vast majority of sets must be 
non-constructible. 

theory). It led Solovay to conjecture that stronger large cardinal axioms will imply
regularity properties for all projective sets—a conjecture that was spectacularly 
confirmed through later works of set theorists such as Martin, Steel, and Woodin. 


22.5 Infinite Games and Determinacy 


Closely related to large cardinal axioms in this regard are the axioms or principles 
of determinacy. Determinacy provides the key to understanding why large cardinals 
imply regularity properties for projective sets. 

Given $A \subseteq \mathbf{N}^{\mathbf{N}}$, consider the game $G(A)$ played by two players I and II
alternately choosing natural numbers $x_1, x_2, x_3, \dots$ as follows, with Player I going
first:

I  | $x_1$   $x_3$   $x_5$   $x_7$   ...
II | $x_2$   $x_4$   $x_6$   $x_8$   ...


The resulting sequence $x = \langle x_1, x_2, x_3, \dots\rangle \in \mathbf{N}^{\mathbf{N}}$ is called a play or run of the
game, and we declare this play $x$ to be a win for Player I if $x \in A$; otherwise we
say that the play $x$ is a win for Player II.

A strategy for Player I is a function $\sigma\colon \{u \in \mathbf{N}^* \mid \mathrm{len}(u) \text{ is even}\} \to \mathbf{N}$, and given
a play $x = \langle x_1, x_2, \dots\rangle \in \mathbf{N}^{\mathbf{N}}$ we say that I plays according to $\sigma$ or follows $\sigma$ if for
all even $n$, $x_{n+1} = \sigma(\langle x_1, x_2, \dots, x_n\rangle)$. We say that $\sigma$ is a winning strategy for I if
every play following $\sigma$ is a win for Player I, i.e., I always wins by playing according
to $\sigma$, no matter what II plays.

The corresponding notions for Player II (strategy $\tau$ for Player II, etc.) are similarly
defined.
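
The way two strategies jointly determine a play can be illustrated on finite initial segments. In the Python sketch below, strategies are ordinary functions on tuples of moves; the names play, follows_I, sigma, and tau are hypothetical, the two sample strategies are arbitrary choices made for illustration, and natural numbers are taken to start at 1.

```python
def play(sigma, tau, length):
    """First `length` moves of the play in which I follows sigma and II follows tau."""
    x = []
    for _ in range(length):
        # I moves after an even number of moves, II after an odd number.
        mover = sigma if len(x) % 2 == 0 else tau
        x.append(mover(tuple(x)))
    return tuple(x)

def follows_I(x, sigma):
    """Check that every move of Player I in the finite sequence x agrees with sigma."""
    return all(x[n] == sigma(tuple(x[:n])) for n in range(0, len(x), 2))

def sigma(u):
    """Sample strategy for I: play 1, then 2, then 3, and so on."""
    return 1 + len(u) // 2

def tau(u):
    """Sample strategy for II: copy the move Player I has just made."""
    return u[-1]

x = play(sigma, tau, 6)
print(x)
print(follows_I(x, sigma))
```

A strategy only looks at the moves made so far, so each finite initial segment of the play is computed without knowing anything about the opponent beyond the moves already on the table.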

A game G(A) (or the set A) is said to be determined if either I or II has a winning 
strategy, i.e., if one of the players can force a win no matter how the opponent plays. 

It can be shown that games with only finitely long plays are always determined; 
but for an arbitrary (infinite) game it is not at all clear that it will necessarily be 
determined. 
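
For games of a fixed finite length with finitely many available moves, determinacy can be verified by backward induction on positions. The following Python sketch is an illustration under our own assumptions: the function names winner and even_sum are hypothetical, the players choose from a finite list of moves, and the payoff condition is given on complete plays of the stated length.

```python
def winner(position, moves, length, wins_for_I):
    """Return "I" or "II": the player who has a winning strategy from `position`."""
    if len(position) == length:
        return "I" if wins_for_I(position) else "II"
    # I moves at positions of even length, II at positions of odd length.
    mover, other = ("I", "II") if len(position) % 2 == 0 else ("II", "I")
    outcomes = {winner(position + (m,), moves, length, wins_for_I) for m in moves}
    # The player to move wins iff some move leads to a position won by that player.
    return mover if mover in outcomes else other

def even_sum(play):
    """Sample payoff: Player I wins iff the sum of all moves is even."""
    return sum(play) % 2 == 0

print(winner((), (1, 2), 4, even_sum))
print(winner((), (1, 2), 3, even_sum))
```

In this sample game whoever moves last controls the parity of the sum, so the recursion reports II for length 4 and I for length 3. The recursion always returns one of the two players, which is exactly the statement that such a finite game is determined.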

We will now make two identifications: 


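• We will identify $\mathbf{N}^{\mathbf{N}}$ with the real interval (0, 1] using the bijective mapping
$H\colon \mathbf{N}^{\mathbf{N}} \to (0, 1]$ (Problem 421) given by:


$H(\langle n_1, n_2, n_3, \dots\rangle) = \frac{1}{2^{n_1}} + \frac{1}{2^{n_1+n_2}} + \frac{1}{2^{n_1+n_2+n_3}} + \cdots$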


• The reals R can be identified with the open interval (0, 1) via some very effective
homeomorphism such as $x \mapsto \frac{1}{2} + \frac{x}{2(1+|x|)}$.


We can therefore talk about games G(E) where E is a subset of (0, 1] or of R (by
“transferring” the set E to a subset of $\mathbf{N}^{\mathbf{N}}$ via the above identifications).
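
The mapping H in the first identification can be approximated from finite initial segments of a sequence. The Python sketch below computes the partial sum determined by a finite prefix, together with a bound on the remaining tail; the name H_partial is our own, and we assume that the entries are positive integers, as the formula for H requires.

```python
from fractions import Fraction

def H_partial(prefix):
    """Partial sum of H over a finite prefix, plus an upper bound for the tail.

    If e = n_1 + ... + n_k, the terms after the k-th are positive and sum to
    at most 2^(-e), so H(x) lies in (total, total + 2^(-e)] for every
    infinite sequence x extending the prefix.
    """
    total = Fraction(0)
    exponent = 0
    for n in prefix:
        if n < 1:
            raise ValueError("entries must be positive integers")
        exponent += n
        total += Fraction(1, 2 ** exponent)
    return total, Fraction(1, 2 ** exponent)

print(H_partial((1, 1, 1)))   # partial sum 1/2 + 1/4 + 1/8, tail bound 1/8
print(H_partial((2, 1)))      # partial sum 1/4 + 1/8, tail bound 1/8
```

Each finite position in a game thus pins down a half-open interval of reals, and longer positions give shorter intervals.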

Problem 1311. If A is countable then II has a winning strategy in G(A).


The axiom of determinacy (AD), first introduced by Mycielski and Steinhaus 
in 1962, says that every set of reals is determined. Properties of the axiom of 
determinacy were studied by Mycielski [56]. It is a very powerful axiom which 
implies regularity properties for all sets: Under AD, all sets of reals have the perfect 
set property, are measurable, and have the Baire property. Thus, AD is incompatible 
with the Axiom of Choice, but as an alternative to AC it is an extremely interesting 
axiom with some surprising implications. For example, Solovay proved that under 
AD, $\aleph_1$ is a measurable cardinal!


Problem 1312. Let $A \subseteq (0, 1]$. If I has a winning strategy in G(A), then A has a
perfect subset. If II has a winning strategy in G(A), then the complement of A has a 
perfect subset. 


[Hint: If I has a winning strategy $\sigma$ in G(A), let $P \subseteq (0, 1]$ be the set of reals
corresponding to all plays according to $\sigma$ in which II always plays 1 or 2:


$P := \{H(x) \mid \text{for all } n,\ x_{2n-1} = \sigma(\langle x_1, x_2, \dots, x_{2n-2}\rangle) \text{ and } x_{2n} \in \{1, 2\}\}$.


Then P is a perfect subset of A.]
Corollary 1313. If B is a Bernstein set then G(B) is not determined. 


Recall, however, that the construction of a Bernstein set is highly non-effective and 
requires heavy use of the full axiom of choice. We therefore consider games G(A) 
with A restricted to some natural class of effectively defined sets (such as open,
Borel, analytic, projective, etc.) and ask if such games are necessarily determined.
A very basic and early result in such restricted definable determinacy principles is
the Gale–Stewart Theorem:


Theorem 1314 (Gale-Stewart 1953). Every open game is determined. Every 
closed game is determined. 


Roughly speaking, the more effectively a set is defined, the easier it is to establish 
that it is determined. Thus, it is somewhat harder to prove that $F_\sigma$ and $G_\delta$ games
are determined, and still harder to prove that $F_{\sigma\delta}$ and $G_{\delta\sigma}$ games are determined.
Work of Harvey Friedman indicated the reason behind such increasing levels of
difficulty: To establish determinacy for each additional level of the Borel hierarchy
one needs the existence of an additional level of the cumulative hierarchy of sets
$V_\alpha$ for $\alpha > \omega$. Gale and Stewart had asked if all Borel games are determined, and
by Friedman’s result establishing Borel determinacy would require an uncountable
number of iterations of the power set operation all the way through $V_{\omega_1}$. This means
Borel determinacy is a result that cannot be established in Zermelo set theory Z with 
choice (i.e., ZFC minus the replacement axiom). Martin established the celebrated 
result: 


Theorem 1315 (Martin 1975). Every Borel game is determined. 

Borel determinacy is the first major mathematical result provable in ZFC that
requires the full strength of ZFC via essential use of the replacement axiom. It is
the strongest determinacy principle for a natural class of definable sets that can be
proved in ZFC. Determinacy of analytic ($\boldsymbol{\Sigma}^1_1$) games, as we will see now, requires
stronger (large cardinal) assumptions.

Even before Borel determinacy was proved, work of Martin, Kechris, and 
Solovay had established fundamental connections between determinacy and large 
cardinals. We already mentioned Solovay’s result that full AD implies that $\aleph_1$ is a
measurable cardinal. The following two theorems show that determinacy of analytic
games interpolates in between the hypothesis and conclusion of Solovay’s earlier
Theorem 1309.


Theorem 1316 (Martin). If a measurable cardinal exists, then all analytic games
are determined. 


Theorem 1317. If all analytic games are determined, then all $\boldsymbol{\Sigma}^1_2$ sets are
measurable, have the perfect set property, and the Baire property.


By Theorem 1310, the perfect set property for $\boldsymbol{\Sigma}^1_2$ sets implies relative consistency
of inaccessible cardinals; hence, by Theorem 1317, analytic determinacy implies 
the consistency of inaccessibles as well, and so cannot be proved in ZFC. We have 
thus encountered a definable determinacy principle for a naturally arising class of 
effectively defined sets which is inextricably linked to large cardinals. 

Actually, analytic determinacy implies relative consistency of much larger 
cardinals, in fact, larger than weakly compact cardinals. On the other hand, the 
hypothesis of measurable cardinals in Theorem 1316 is too strong, and analytic 
determinacy can be derived from smaller cardinals (see [57] for such a proof). By 
an exact characterization due to Martin and Harrington (in terms of so-called sharps
or Silver indiscernibles) analytic determinacy has consistency strength lying strictly
between weakly compact and measurable cardinals. (Going further, determinacy of
$\boldsymbol{\Sigma}^1_2$ sets implies regularity properties for $\boldsymbol{\Sigma}^1_3$ sets and has much stronger consistency
strength, entailing the relative consistency of many measurable cardinals.)

These results indicate that determinacy principles provide the key for obtaining 
regularity properties for projective classes, and they themselves represent a form of 
large cardinal axioms. In fact, it turns out, beautifully, that determinacy principles 
establish a correlation between the projective hierarchy and large cardinal axioms 
such that determinacy for larger projective classes corresponds to stronger large 
cardinal axioms. 


22.6 Projective Determinacy 


Using determinacy principles, Theorem 1317 can be generalized through the entire 
projective hierarchy: 

Theorem 1318 (Kechris–Martin, after Mazur, Banach, Mycielski, Świerczkowski,
Davis). If all $\boldsymbol{\Sigma}^1_n$ games are determined then all $\boldsymbol{\Sigma}^1_{n+1}$ sets are
measurable, have the perfect set property, and have the Baire property.


The assumption that all projective games are determined is known as Projective 
Determinacy (PD). Thus under PD, the regularity properties of analytic sets extend 
to all the higher projective classes—a result that Lusin believed (correctly) would 
be impossible to obtain using the ordinary axioms of mathematics. 


Corollary 1319. If all projective games are determined, then all projective sets are
measurable, have the perfect set property, and have the Baire property. 


Another line of development which uses projective determinacy concerns structural
properties of the projective classes. We had proved the Lusin separation theorem
(Theorem 1151) for the class of analytic sets, or $\boldsymbol{\Sigma}^1_1$. The strongest classical
separation theorem was for the class $\boldsymbol{\Pi}^1_2$. In 1967, Blackwell found a proof of the $\boldsymbol{\Sigma}^1_1$
separation theorem using determinacy of closed games. Assuming determinacy of
projective sets, Addison, Martin, and Moschovakis generalized Blackwell’s result
through the entire projective hierarchy, and the separation property was found
precisely in the classes $\boldsymbol{\Sigma}^1_{2n-1}$ and $\boldsymbol{\Pi}^1_{2n}$ ($n = 1, 2, \dots$).⁵

Thus PD gives a complete structure theory for the projective classes, i.e., the 
entire theory of projective classes takes a remarkably canonical and coherent form 
under PD, with all questions about regularity and structural properties settled in an 
intuitively desirable and natural fashion. We can speculate that, perhaps, this is the 
best that Lusin could have hoped for. 

The optimal large cardinal notion that implies determinacy for the projective 
classes is that of a Woodin cardinal. We will not define Woodin cardinals here 
(see [37] or [34] for a definition), but state the following seminal result:⁶


Theorem 1320 (Martin–Steel 1985). If there are $n$ Woodin cardinals and a
measurable cardinal above them all, then all $\boldsymbol{\Sigma}^1_{n+1}$ games are determined.

If there are infinitely many Woodin cardinals, then all projective games are
determined.⁷


In the other direction, we have the following results. 


Theorem 1321. $\boldsymbol{\Sigma}^1_{n+1}$-determinacy implies the relative consistency of the existence of
$n$ Woodin cardinals. Therefore, projective determinacy implies, for each $n \in \mathbf{N}$, the
consistency of the existence of $n$ Woodin cardinals.


⁵ Other stronger structural properties that we have not defined (such as reduction, pre-well ordering,
uniformization, and scale) hold in the dual (opposite) classes.

⁶ Deep research by several set theorists including Martin, Steel, Kechris, Foreman, Magidor,
Shelah, and Woodin, culminated in the final ideas and results.

⁷ Woodin showed that with a marginally stronger hypothesis (existence of a measurable cardinal
above infinitely many Woodin cardinals) the determinacy of a much larger class of sets (than the 
projective sets) called L(R) can be established. 

Theorem 1322 (Woodin). The full Axiom of Determinacy (AD) is consistent with
ZF (without Choice) if and only if the existence of infinitely many Woodin cardinals
is consistent with ZFC.


These results further confirm our earlier statement that determinacy is a form of 
large cardinal axiom, via an almost perfect “correlation of strength” through the 
projective classes. 

The remarkable results above (and many others that were not mentioned) indicate 
why most set theorists find that the axiom of projective determinacy (as opposed to 
V=L) gives the “true and correct” picture for the projective sets, and therefore can 
be regarded as a truly natural strong axiom vindicating Gödel’s program, as far as
the theory of projective sets is concerned.


The situation for CH is far more complex. 


22.7 Does the Continuum Hypothesis Have a Truth Value? 


As mentioned earlier, the Continuum Problem, which was first on Hilbert’s famous 
list, is widely regarded as the greatest problem of set theory. It has remained 
unsettled after more than a hundred years of attack. Moreover, unlike Lebesgue 
and Banach’s Measure Problem or Lusin’s Problem involving the projective sets, 
the Continuum Problem cannot be resolved using the usual type of large cardinal 
hypotheses.⁸

Of course, pluralists (formalists) may not think that CH can ever be decided, and 
some of them may think that the Gödel–Cohen independence results have settled
the matter for ever. For many pluralists, CH does not have an absolute or intrinsic 
truth value. For some, it may not even be a well-defined mathematical problem. 

Supporters of Gödel’s program, on the other hand, keep searching for strong
natural axioms which might decide CH. Most of the known axioms which decide
CH (such as V=L and Martin’s Maximum) are not considered sufficiently natural
to be acceptable. Thus the Continuum Problem, unlike the theory of projective
sets, remains open from the perspective of Gödel’s program. However, some highly
sophisticated recent work of Woodin and others has made the problem more 
tantalizing than ever by arguing that natural axioms settling the Continuum Problem 
may be around the corner. This has been a topic of much discussion (and debate) 
among set theorists, who differ widely in their mathematical and philosophical 
approaches to CH. For a general survey of this large subject, see Koellner’s article 
[87] on CH in the online Stanford Encyclopedia of Philosophy. 

The recent EFI project (Exploring the Frontiers of Incompleteness) of Koellner 
brought together major thinkers in a workshop on this foundational debate.⁹


⁸ This was shown by Cohen, Levy, and Solovay.


⁹ The web site http://logic.harvard.edu/efi.php has more information and resources. The project is
funded by a grant from the John Templeton Foundation. 
22.8 Further References


Introductory accounts of constructibility can be found in [14, 34, 42, 55], while
Gödel’s original presentation is [22].

Standard references for learning forcing are [2, 34, 42, 44], but Cohen’s original
text [8] is still in print.

For large cardinals, the definitive reference is [37] (see also [34]), but the older
[14] is helpful as well.

The theory of determinacy of infinite games is covered in [34, 37, 55]. Inviting
introductions to this area can be found in [53, 57].

In addition to Koellner [87] mentioned above, discussions on some of the recent
approaches to the Continuum Problem are in [1, 17, 52], and in Woodin’s own
expository articles [82, 83].

A handbook containing highly advanced, up-to-date surveys of current research
in set theory is [18].


Appendix A 
Proofs of Uncountability of the Reals 


In this appendix, we summarize and review the proofs of uncountability of the reals 
given in the main text, and indicate how the methods of these proofs generalize and 
connect to other areas of mathematics. (This appendix is not an exhaustive list of 
such proofs.) 

There were essentially three distinct proofs of uncountability of the reals given 
in the text. All proofs depend, in the end, on some form of order completeness of 
R, but they take very different forms and generalize in different ways to give other 
significant results in mathematics. 


A.1 Order-Theoretic Proofs 


Section 8.5 presented a proof of uncountability of the reals which follows immediately
from Cantor’s powerful theorem characterizing the order type $\eta$ (which
says any countable dense order without endpoints has order type $\eta$). That theorem
also implies $\eta + \eta = \eta$, and so any countable dense order must have Dedekind
gaps. Hence any dense linear order without Dedekind gaps, such as R, must be
uncountable.

This proof is so short because it exploits a very powerful result of order theory. 
It is related to Cantor’s first proof of uncountability of R, which directly shows that 
a countable dense order cannot be complete: 


Proof (Cantor’s first proof of uncountability of R). To get a contradiction, suppose
that the set of real numbers can be enumerated as $p_1, p_2, \ldots$ (without repetition).
Recursively define two sequences of reals $\langle a_n \rangle$ and $\langle b_n \rangle$ with


$a_1 < a_2 < \cdots < a_n < a_{n+1} < \cdots < b_{n+1} < b_n < \cdots < b_2 < b_1,$


in the following manner. Let $a_1 = p_1$, and $b_1 = p_m$ where $m$ is the least index such
that $a_1 < p_m$. Having defined $a_1, a_2, \ldots, a_n$ and $b_1, b_2, \ldots, b_n$ with $a_n < b_n$, define
$a_{n+1} = p_j$ where $j$ is the least index such that $a_n < p_j < b_n$, and $b_{n+1} = p_k$
where $k$ is the least index such that $a_{n+1} < p_k < b_n$. Then we have $a_n < a_{n+1} <
b_{n+1} < b_n$, and the recursive definition is complete. In particular, for each $n$ we
have $a_n = p_{j_n}$ and $b_n = p_{k_n}$ for some indices $j_n$ and $k_n$. Now, by completeness of
R, there must be a real number $p$ such that $a_n < p < b_n$ for all $n$, and so $p = p_i$ for
some $i$. Since the indices $j_n$ are all distinct, we can fix $n$ with $j_{n+1} > i$. Note that
$a_n < p_i < b_n$, and by definition of $a_{n+1} = p_{j_{n+1}}$ we see that $j_{n+1}$ equals the least
index $j$ such that $p_j$ lies between $a_n$ and $b_n$, and so $j_{n+1} \le i$, a contradiction. □


This was Cantor’s first published proof of the uncountability of R. Given any 
enumeration of a countable dense order, it effectively produces a gap in it. 
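
The recursive construction in Cantor’s first proof can be run mechanically on any enumerated countable dense order. The following is a minimal Python sketch, assuming the positive rationals enumerated in Calkin-Wilf order as the countable dense order; the names `PositiveRationals`, `least_index`, and `cantor_sequences` are illustrative and do not come from the text.

```python
from fractions import Fraction
from math import floor


class PositiveRationals:
    """Lazy 1-based enumeration p_1, p_2, ... of the positive rationals
    without repetition (Calkin-Wilf order, an illustrative choice)."""

    def __init__(self):
        self._terms = [Fraction(1)]

    def __getitem__(self, i):
        while len(self._terms) < i:
            q = self._terms[-1]
            # Newman's recurrence for the next Calkin-Wilf term.
            self._terms.append(1 / (2 * floor(q) - q + 1))
        return self._terms[i - 1]


def least_index(p, lo, hi=None):
    """Least j with lo < p_j (and p_j < hi when hi is given)."""
    j = 1
    while not (lo < p[j] and (hi is None or p[j] < hi)):
        j += 1
    return j


def cantor_sequences(p, steps):
    """First `steps` terms of the sequences (a_n) and (b_n) from the proof."""
    a = [p[1]]                              # a_1 = p_1
    b = [p[least_index(p, a[0])]]           # b_1 = first p_m above a_1
    for _ in range(steps - 1):
        a.append(p[least_index(p, a[-1], b[-1])])   # a_{n+1} in (a_n, b_n)
        b.append(p[least_index(p, a[-1], b[-1])])   # b_{n+1} in (a_{n+1}, b_n)
    return a, b
```

Each early term of the enumeration ends up outside the interval between the latest $a_n$ and $b_n$, so in the limit no enumerated point lies between all the $a_n$ and all the $b_n$. This is the gap the proof produces.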


Both the proof of Sect. 8.5 based on Cantor’s theorem characterizing the order
type $\eta$ and Cantor’s first proof given above appeal to order completeness, but note
that full completeness is not necessary. For both proofs, it suffices to assume that
there are no $(\omega, \omega^*)$ gaps in the ordering.


Proposition 1323. A dense order without $(\omega, \omega^*)$ gaps has cardinality $> \aleph_0$.

In this form, the proof generalizes to $\eta_1$ orders without $(\omega_1, \omega_1^*)$ gaps:

Proposition 1324. Any $\eta_1$ order without $(\omega_1, \omega_1^*)$ gaps has cardinality $> \aleph_1$.


Proof. Recall that any two $\eta_1$ orders of cardinality $\aleph_1$ must be isomorphic to each
other. If there were an $\eta_1$ order $X$ of cardinality $\aleph_1$ without $(\omega_1, \omega_1^*)$ gaps, then any
suborder $Y$ of $X$ obtained by removing a single point of $X$ would also be an $\eta_1$
order of cardinality $\aleph_1$ and so must be isomorphic to $X$. But $Y$ has an $(\omega_1, \omega_1^*)$ gap,
and so $X$ has such a gap, a contradiction. □


Another related generalization is this: Any dense-in-itself complete order contains
an isomorphic copy of the real line and so has cardinality $\ge \mathfrak{c}$.


Connected spaces and their uncountability. As mentioned in the text, the notion 
of connectedness in topology is a direct generalization of Dedekind’s definition of 
linear continuum: An order is a continuum if and only if in any Dedekind partition 
of the order at least one of the sets contains a point which is a limit point of the 
other. A metric or topological space is connected if and only if for any partition of 
the space into two nonempty sets, at least one set contains a limit point of the other. 
Under certain regularity conditions, the uncountability of linear continuums carries 
over to connected spaces. To see this, note that the Intermediate Value Theorem 
generalizes: The range of any continuous function from a connected space to an 
order must be a linear continuum. Since the distance function on a metric space is 
continuous, any connected metric space with at least two points is uncountable.¹


¹ By a basic topological result known as Urysohn’s Lemma, this generalizes to any $T_4$ (normal
Hausdorff) topological space, and in fact to any $T_3$ (regular Hausdorff) space: Any connected $T_3$
space with at least two points must be uncountable. All these generalizations are thus related to the
order-based proof of uncountability of R.
Cantor discovered his “diagonal method” for proving uncountability several years
after he obtained his order-based proof given above where he first discovered that R 
is uncountable. Unlike the order-theoretic proofs, the diagonal method is applicable 
in much more general situations where no order may be present. 

In a sense, diagonalization means that given an infinite list of conditions, we
construct a “counterexample” real number which refutes all those conditions. The
nested intervals theorem gives a direct version of this form of diagonalization: Given
a sequence of reals $\langle x_1, x_2, \ldots \rangle$, one builds nested closed intervals of shrinking
length $I_1 \supseteq I_2 \supseteq \cdots$ such that $x_1 \notin I_1$ (“$I_1$ avoids $x_1$”), $x_2 \notin I_2$, and so on. The
unique real $x$ in their intersection then differs from all the given reals $x_1, x_2, \ldots$.
Here the $n$-th given condition is “$x = x_n$”, and the above method of diagonalization
via nested intervals produces the real $x$ which satisfies $x \neq x_n$ for all $n$. Therefore,
we call $x$ the diagonal counterexample for the given sequence $\langle x_1, x_2, \ldots \rangle$ of reals.

In this proof, we could, for definiteness, use the specific scheme for building
nested closed intervals where the initial interval $I_0$ is the unit interval $I_0 = [0, 1]$,
and each $I_n$ is either the left-third or the right-third subinterval of $I_{n-1}$ (whichever
avoids the real $x_n$ first). The diagonal counterexample will then always be a member
of the Cantor set, and conversely, any member of the Cantor set can be seen to be
a diagonal counterexample for a suitably given sequence of reals $\langle x_1, x_2, \ldots \rangle$. It
follows that with this scheme of building nested intervals, the Cantor set is the set of
all possible diagonal counterexamples to various given sequences of real numbers.
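
The left-third/right-third scheme is easy to run on a finite prefix of a given sequence. The following is a minimal Python sketch, assuming exact rational inputs; the function name `diagonal_counterexample` and the recorded ternary digits (0 for a left third, 2 for a right third) are illustrative additions, not notation from the text.

```python
from fractions import Fraction


def diagonal_counterexample(xs):
    """Run the thirds scheme on the finite list xs of reals (as Fractions).

    Returns the last closed interval (lo, hi) and the ternary digits chosen:
    0 when the left third is taken, 2 when the right third is taken.
    """
    lo, hi = Fraction(0), Fraction(1)      # I_0 = [0, 1]
    digits = []
    for x in xs:
        third = (hi - lo) / 3
        if not (lo <= x <= lo + third):
            # The left third avoids x, so take it.
            hi = lo + third
            digits.append(0)
        else:
            # x lies in the left third, so the disjoint right third avoids it.
            lo = hi - third
            digits.append(2)
    return (lo, hi), digits
```

Because only the digits 0 and 2 occur, every point of the final interval begins with a ternary expansion of Cantor-set form, which matches the remark that the diagonal counterexample always lies in the Cantor set.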

With a little modification, the above proof of uncountability of R yields the Baire
Category Theorem, where the $n$-th condition to be met is to be inside an arbitrary
given dense open set $G_n$ (instead of the special dense open set of the form $\{x \mid
x \neq x_n\}$). The Baire category theorem holds in complete metric spaces as well
as in locally compact Hausdorff spaces, and thus any such space without isolated
points must be uncountable (and in fact of cardinality at least $\mathfrak{c}$). This illustrates
how Cantor’s diagonal method leads to a powerful general theorem of very wide
applicability.

In a more literal form of diagonalization we regard a family $\langle E_i \mid i \in E \rangle$ of
subsets of a set $E$ indexed by $E$ itself as the following relation on $E$:


$\{\langle i, j \rangle \in E \times E \mid j \in E_i\},$


(or, using the identification via characteristic functions, as a binary array
$\langle a_{i,j} \mid i, j \in E \rangle$ where each $a_{i,j}$ is 0 or 1). We then form the diagonal set
$D := \{i \in E \mid i \in E_i\}$, and finally take its complement to get the “anti-diagonal”
set $A := E \setminus D = \{i \in E \mid i \notin E_i\}$, which must differ from all the sets $E_i$. In
other words, it shows that $P(E)$ cannot be listed as a family of sets indexed by $E$.
This is Cantor’s theorem that $|E| < |P(E)|$, another far reaching generalization (of
the uncountability of R) which ensures existence of sets of arbitrarily large infinite
cardinality.
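
For a finite index set, the anti-diagonal construction can be checked directly. The following is a small Python sketch, assuming the family is given as a dictionary from each index to a subset of the index set; the function name `anti_diagonal` and the sample family are hypothetical.

```python
def anti_diagonal(family):
    """Given family[i] = E_i (a subset of the index set E = family.keys()),
    return A = {i in E : i not in E_i}."""
    return {i for i in family if i not in family[i]}


# A hypothetical family of subsets of E = {0, 1, 2}, indexed by E itself.
family = {0: {0, 1}, 1: set(), 2: {0, 2}}
A = anti_diagonal(family)

# A differs from every E_i: i belongs to exactly one of A and E_i.
differs_from_all = all(A != family[i] for i in family)
```

The same disagreement at the index $i$ is what forces $A \neq E_i$ for arbitrary, possibly infinite, $E$.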
This last version is a more abstract form of diagonalization which is usually
referred to as the Cantor diagonal method. 

The Cantor set establishes a close connection between these two forms of the 
diagonal method: It is constructed by a “binary tree of nested intervals” in which 
infinite branches (of nested intervals) through the tree correspond, on the one hand, 
to the points of the Cantor set, and, on the other hand, to infinite binary sequences, 
i.e., to members of $\{0, 1\}^N$ or to subsets of N.

One thus obtains a variant of the diagonal proof of uncountability of R by
identifying the Cantor set with P(N) (or with $\{0, 1\}^N$) and then appealing to the
abstract Cantor diagonal theorem that $|P(N)| > |N|$.

The more abstract version of the Cantor diagonal method has quite wide 
ramifications. It not only gives (via Cantor’s theorem that | P(X)| > |X|) sets of 
larger and larger infinite cardinalities by iterating the power set operation, but also is 
a method used in the proofs of many important theorems of logic and computability, 
such as Gödel’s incompleteness theorem, the unsolvability of the Halting problem,
and Tarski’s undefinability theorem. 


A.3 Proof Using Borel’s Theorem on Interval Lengths 


In Corollary 1018 it was shown that the interval [a,b] is uncountable using 
properties of lengths of intervals. The length of a bounded interval in R is defined 
by 


$\mathrm{len}([a, b]) = \mathrm{len}((a, b]) = \mathrm{len}([a, b)) = \mathrm{len}((a, b)) = b - a \quad (a < b).$


The length function thus defined on the intervals has several natural properties
(which are essential in obtaining the Lebesgue measure on R). For example, the
lengths of intervals are easily seen to satisfy the condition of finite additivity, which
says that if an interval $I$ is partitioned into finitely many pairwise disjoint intervals
$I_1, I_2, \ldots, I_n$, then


$\mathrm{len}(I) = \mathrm{len}(I_1) + \mathrm{len}(I_2) + \cdots + \mathrm{len}(I_n).$


However, the key fact about lengths of intervals used in the uncountability proof 
mentioned above was Borel’s theorem, which says that the interval [a, b], which has 
length $b - a$, cannot be covered by countably many intervals of smaller total length.
This important condition is known as countable subadditivity of length, which was
established (in Borel’s theorem) using the powerful Heine–Borel theorem. Since any
countable set of reals can be covered by countably many intervals having arbitrarily 
small total length, countable subadditivity immediately implies that a proper interval 
must be uncountable. 

The proof also readily generalizes to more abstract setups as follows. Let $X$ be a
fixed set. A nonempty collection $S$ of subsets of $X$ is called a semiring on $X$ if for
any $A, B \in S$ the intersection $A \cap B$ is in $S$ and the difference $A \setminus B$ can be expressed
as the union of finitely many pairwise disjoint sets from $S$. By a set-function on a
semiring $S$ we mean a function $\mu$ defined on $S$ which takes nonnegative extended
real values (i.e., we allow $\mu(A)$ to be $+\infty$). A set-function $\mu$ on a semiring $S$ on
$X$ is said to be continuous if for every $p \in X$ and every $\epsilon > 0$ there is a set
$E \in S$ with $p \in E$ and $\mu(E) < \epsilon$, and $\mu$ is said to be countably subadditive on
$S$ if whenever $E \in S$ is covered by countably many sets $E_1, E_2, \ldots \in S$, we have
$\mu(E) \le \sum_{n=1}^{\infty} \mu(E_n)$. Essentially the same proof that a countable set has measure
zero now immediately gives:


Proposition 1325. Suppose that $\mu$ is a nonnegative continuous set function on a
semiring $S$ of subsets of a fixed set $X$. If $\mu$ is countably subadditive on $S$, then $E$ is
uncountable for any $E \in S$ for which $\mu(E) \neq 0$.
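
The argument behind Proposition 1325 can be written out as a short derivation. The following is a sketch, assuming for contradiction an enumeration $p_1, p_2, \ldots$ of a countable $E \in S$; the covering sets $E_n$ and the bounds $\epsilon/2^n$ are illustrative choices that the text does not fix.

```latex
% Sketch: a countable E in S must satisfy mu(E) = 0.
% Assumptions: E = {p_1, p_2, ...} is countable and epsilon > 0 is arbitrary.
\begin{align*}
&\text{By continuity, choose } E_n \in S \text{ with } p_n \in E_n
  \text{ and } \mu(E_n) < \epsilon / 2^n . \\
&\text{Then } E \subseteq \bigcup_{n} E_n , \text{ so by countable subadditivity} \\
&\mu(E) \;\le\; \sum_{n=1}^{\infty} \mu(E_n)
        \;\le\; \sum_{n=1}^{\infty} \frac{\epsilon}{2^n} \;=\; \epsilon .
\end{align*}
% Since epsilon > 0 was arbitrary, mu(E) = 0. Contrapositively,
% mu(E) \neq 0 forces E to be uncountable.
```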


Countable subadditivity is necessary here. For example, let X be the set Q of rational 
numbers. By a rational half-open interval we mean a set of the form $[a, b) \cap Q$ with
a,b € Q. The set of half-open rational intervals forms a semiring on Q on which the 
length function (defined as before) is continuous and finitely additive. But countable 
subadditivity fails and every rational interval is countable. 

We conclude by noting that under finite additivity, the condition of countable 
subadditivity (as in Borel’s theorem) actually entails a much stronger and important 
result known as the measure extension theorem, whose proof can be found in 
any standard textbook of measure theory. By a measure we mean a nonnegative 
extended real valued set-function $\nu$ defined on a sigma-algebra which vanishes on
the empty set ($\nu(\emptyset) = 0$) and which satisfies the condition that if $\langle A_n \rangle$ is a pairwise
disjoint sequence of sets from the sigma-algebra then $\nu(\bigcup_n A_n) = \sum_n \nu(A_n)$
(countable additivity).


Theorem 1326 (The Measure Extension Theorem). Let $\mu$ be a finitely additive
nonnegative extended real valued set-function on a semiring $S$ of subsets of a fixed
set $X$. Assume that $X = \bigcup_n A_n$ for some sets $A_n \in S$ with $\mu(A_n) < \infty$ for all $n$.
If $\mu$ is countably subadditive on $S$, then there is a unique measure defined on the
sigma-algebra generated by $S$ which extends $\mu$.


Taking $S$ to be the semiring of all real intervals of the form $[a, b)$ and $\mu$ to be the
length function on such intervals, we get the following immediate corollary of the 
theorem: There is a unique measure defined on the Borel subsets of R for which 
the measure of any interval is its length. This measure is known as the Lebesgue 
measure, and it also uniquely extends as a measure to the collection of all Lebesgue 
measurable sets (the sigma-algebra generated by the Borel sets together with the 
measure zero sets). 


Appendix B 
Existence of Lebesgue Measure 


This appendix gives a proof of the existence of Lebesgue measure. That is, we prove
Theorem 1028 whose statement is as below. Recall that $E \in L$, or $E$ is measurable,
if for all $\epsilon > 0$ there exist closed $F$ and open $G$ with $F \subseteq E \subseteq G$ and intervals
$I_1, I_2, \ldots$ covering $G \setminus F$ with $\sum_n \mathrm{len}(I_n) < \epsilon$.

Theorem (Lebesgue). There is $m \colon L \to [0, \infty]$ such that


1. $m$ is countably additive: If $A_1, A_2, \ldots$ are pairwise disjoint measurable sets, then
$m(\bigcup_n A_n) = \sum_n m(A_n)$.

2. $m(I) = \mathrm{len}(I)$ for any interval $I$ (thus $m(\emptyset) = 0$).


To prove the theorem, we first define the outer measure $m^*(E)$ of any set $E \subseteq \mathbf{R}$
(not necessarily measurable), and then restrict $m^*$ to $L$ to get $m$.


Definition 1327 (Outer Measure). For any $E \subseteq \mathbf{R}$, we define:


$m^*(E) := \inf \{ \sum_n \mathrm{len}(I_n) \mid \langle I_n \rangle \text{ is a sequence of intervals covering } E \}.$


$m$ is $m^*$ restricted to $L$, so if $E \in L$, then $m^*(E)$ is denoted by $m(E)$.


Recall Borel’s theorem (Theorem 1011) which says $\mathrm{len}(I) \le m^*(I)$ for any interval
$I$. The following facts are now immediate.


Problem 1328 (Monotonicity). If $A \subseteq B$ then $m^*(A) \le m^*(B)$.

Proposition 1329. For any interval $I$, $m^*(I) = \mathrm{len}(I)$.

Proof. $m^*(I) \le \mathrm{len}(I)$ is trivial and Borel’s theorem says $m^*(I) \ge \mathrm{len}(I)$. □


Proposition 1330 (Countable Subadditivity of Outer Measure). For any
sequence $E_1, E_2, \ldots$ of sets, $m^*(\bigcup_n E_n) \le \sum_n m^*(E_n)$.


Proof. Given $\epsilon > 0$, choose, for each $n$, a sequence of intervals $\langle I_{n,k} \mid k \in N \rangle$
covering $E_n$ and with $\sum_k \mathrm{len}(I_{n,k}) < m^*(E_n) + \epsilon/2^n$. Combining all these sequences
of intervals into a single sequence, we get a covering of $\bigcup_n E_n$ with total length
$\le \sum_n (m^*(E_n) + \epsilon/2^n) = \sum_n m^*(E_n) + \epsilon$. □


So we can (and will) prove equalities of the form $m^*(\bigcup_n E_n) = \sum_n m^*(E_n)$ by
only showing $m^*(\bigcup_n E_n) \ge \sum_n m^*(E_n)$ (by countable subadditivity).


Corollary 1331. If $A$ is measurable and $\epsilon > 0$ then there are closed $F$ and open
$G$ such that $F \subseteq A \subseteq G$, $m^*(A) \ge m^*(G) - \epsilon$, and $m^*(F) \ge m^*(A) - \epsilon$.


Proof. Let $\epsilon > 0$. Fix closed $F$ and open $G$ such that $F \subseteq A \subseteq G$ and $m^*(G \setminus F) <
\epsilon$. Then by countable subadditivity and monotonicity, $m^*(G) \le m^*(A) + m^*(G \setminus
A) \le m^*(A) + m^*(G \setminus F) < m^*(A) + \epsilon$, so $m^*(A) > m^*(G) - \epsilon$. Similarly,
$m^*(F) > m^*(A) - \epsilon$. □


Proposition 1332. Let $G$ be an open set expressed as a disjoint union of open
intervals $\bigcup_n J_n = G$. Then $m^*(G) = \sum_n \mathrm{len}(J_n)$.


Proof. Easily $m^*(G) \le \sum_n \mathrm{len}(J_n)$ (since the $J_n$’s cover $G$).

For the other direction, let $\langle I_n \rangle$ be any sequence of open intervals covering $G$.
Then for each $n$, $\langle I_n \cap J_m \mid m \in N \rangle$ is a sequence of pairwise disjoint intervals all
contained in $I_n$ and so $\mathrm{len}(I_n) \ge \sum_m \mathrm{len}(I_n \cap J_m)$. Hence


$\sum_n \mathrm{len}(I_n) \ge \sum_n \sum_m \mathrm{len}(I_n \cap J_m) = \sum_m \sum_n \mathrm{len}(I_n \cap J_m) \ge \sum_m \mathrm{len}(J_m),$


where the last inequality follows by Borel’s theorem since for each $m$, the intervals
$\langle I_n \cap J_m \mid n \in N \rangle$ cover $J_m$. □
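
For a finite union of open intervals, the quantity in Proposition 1332 can be computed by merging overlapping intervals into disjoint components and adding their lengths. The following is a minimal Python sketch under that finiteness assumption; the function name `union_length` and the use of `Fraction` endpoints are illustrative choices, not from the text.

```python
from fractions import Fraction


def union_length(intervals):
    """Total length of a finite union of open intervals given as (lo, hi) pairs."""
    # Drop empty intervals and sort by left endpoint.
    ivs = sorted((lo, hi) for lo, hi in intervals if lo < hi)
    total = Fraction(0)
    cur_lo = cur_hi = None
    for lo, hi in ivs:
        if cur_hi is None or lo >= cur_hi:
            # Starts a new disjoint component: add the finished one.
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            # Overlaps the current component: extend it.
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total


# A hypothetical overlapping cover: its summed lengths overshoot the union.
cover = [(Fraction(0), Fraction(1)), (Fraction(1, 2), Fraction(2)), (Fraction(3), Fraction(4))]
cover_sum = sum(hi - lo for lo, hi in cover)
```

Here the summed lengths of the cover exceed the length of the union, in line with the inequality $\sum_n \mathrm{len}(I_n) \ge \sum_m \mathrm{len}(J_m)$ from the proof.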


Corollary 1333. If $G_1$ and $G_2$ are disjoint open sets then $m^*(G_1 \cup G_2) =
m^*(G_1) + m^*(G_2)$.


Proposition 1334. If $F_1$ and $F_2$ are disjoint closed sets then $m^*(F_1 \cup F_2) =
m^*(F_1) + m^*(F_2)$.


Proof. Let $\epsilon > 0$. Fix open $G$ with $F_1 \cup F_2 \subseteq G$ and $m^*(F_1 \cup F_2) \ge m^*(G) - \epsilon$.
Fix disjoint open $G_1$ and $G_2$ containing $F_1$ and $F_2$ respectively (Problem 938). Then
we have $m^*(F_1 \cup F_2) \ge m^*(G) - \epsilon \ge m^*((G \cap G_1) \cup (G \cap G_2)) - \epsilon = m^*(G \cap
G_1) + m^*(G \cap G_2) - \epsilon \ge m^*(F_1) + m^*(F_2) - \epsilon$. □

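Proposition 1335 (Finite Additivity). Let $A$ and $B$ be disjoint measurable sets.
Then $m^*(A \cup B) = m^*(A) + m^*(B)$.

Proof. Let $\epsilon > 0$. Fix closed sets $F_A \subseteq A$ and $F_B \subseteq B$ with $m^*(F_A) \ge m^*(A) - \epsilon/2$
and $m^*(F_B) \ge m^*(B) - \epsilon/2$. Since $F_A$ and $F_B$ are disjoint closed sets,
$m^*(A \cup B) \ge m^*(F_A \cup F_B) = m^*(F_A) + m^*(F_B) \ge m^*(A) + m^*(B) - \epsilon$.
As $\epsilon$ was arbitrary, $m^*(A \cup B) \ge m^*(A) + m^*(B)$. □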
Proposition 1336 (Countable Additivity of Lebesgue Measure). If $A_1, A_2, \ldots$
are disjoint measurable sets then $m^*(\bigcup_n A_n) = \sum_n m^*(A_n)$.


Proof. $\sum_n m^*(A_n) = \sup_n \sum_{k=1}^{n} m^*(A_k) = \sup_n m^*(\bigcup_{k=1}^{n} A_k) \le m^*(\bigcup_n A_n)$. □
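
The one-line proof of Proposition 1336 compresses three steps. The following display is a sketch of one way to annotate them; it assumes, beyond what is shown here, that finite unions of measurable sets are again measurable, so that Proposition 1335 can be applied inductively.

```latex
\begin{align*}
\sum_{n} m^*(A_n)
  &= \sup_{n} \sum_{k=1}^{n} m^*(A_k)
     && \text{(definition of a sum of nonnegative terms)} \\
  &= \sup_{n} m^*\Bigl(\bigcup_{k=1}^{n} A_k\Bigr)
     && \text{(finite additivity, Proposition 1335, applied inductively)} \\
  &\le m^*\Bigl(\bigcup_{n} A_n\Bigr)
     && \text{(monotonicity, Problem 1328).}
\end{align*}
% The reverse inequality is countable subadditivity (Proposition 1330).
```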


The main theorem now follows from Propositions 1329 and 1336.


Appendix C 
List of ZF Axioms 


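ZF 1 (Extensionality). $\forall x \forall y (\forall z (z \in x \leftrightarrow z \in y) \to x = y)$.

ZF 2 (Empty Set). $\exists x \forall y (y \notin x)$.

ZF 3 (Separation Scheme). If $\varphi(x, t_1, t_2, \ldots, t_n)$ is a ZF formula in which the free
variables are among $x, t_1, t_2, \ldots, t_n$, then the following is an axiom:


$\forall t_1 \forall t_2 \cdots \forall t_n \forall a \exists b \forall x (x \in b \leftrightarrow x \in a \wedge \varphi(x, t_1, t_2, \ldots, t_n)).$


ZF 4 (Power Set). $\forall x \exists y \forall z (z \in y \leftrightarrow \forall w (w \in z \to w \in x))$.

ZF 5 (Union). $\forall x \exists y \forall z (z \in y \leftrightarrow \exists w (w \in x \wedge z \in w))$.

ZF 6 (Unordered Pairs). $\forall x \forall y \exists z \forall w (w \in z \leftrightarrow w = x \vee w = y)$.


ZF 7 (Replacement Scheme). If $\varphi(x, y, t_1, t_2, \ldots, t_n)$ is a ZF formula with free
variables among the ones shown, then we have the axiom:


$\forall t_1 \forall t_2 \cdots \forall t_n (\forall x \forall y \forall z (\varphi(x, y, t_1, \ldots, t_n) \wedge \varphi(x, z, t_1, \ldots, t_n) \to y = z)$
$\to \forall a \exists b \forall u \forall v (u \in a \wedge \varphi(u, v, t_1, \ldots, t_n) \to v \in b)).$

ZF 8 (Infinity). $\exists b (\exists y (y \in b \wedge \forall z (z \notin y)) \wedge
\forall x (x \in b \to \exists y (y \in b \wedge \forall z (z \in y \leftrightarrow z \in x \vee z = x))))$.

ZF 9 (Foundation). $\forall x (\exists y (y \in x) \to \exists y (y \in x \wedge \neg \exists z (z \in y \wedge z \in x)))$.

ZFC is obtained by adding to ZF the Axiom of Choice, which says:


$\forall x ((\forall y (y \in x \to \exists z (z \in y)) \wedge
\forall u \forall v (u \in x \wedge v \in x \wedge u \neq v \to \neg \exists y (y \in u \wedge y \in v)))$


$\to \exists w \forall y (y \in x \to \exists! z (z \in y \wedge z \in w))).$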
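
Several of the set-existence axioms above can be illustrated concretely on hereditarily finite sets. The following is a minimal Python sketch, assuming such sets are modelled as nested `frozenset` values; the function names are illustrative, and the model only demonstrates the finite operations (it does not, of course, satisfy the Axiom of Infinity).

```python
from itertools import combinations


def pair(x, y):
    """ZF 6: the unordered pair {x, y}."""
    return frozenset({x, y})


def union(x):
    """ZF 5: the set of all members of members of x."""
    return frozenset(z for w in x for z in w)


def power_set(x):
    """ZF 4: the set of all subsets of x."""
    elems = list(x)
    return frozenset(
        frozenset(c) for r in range(len(elems) + 1) for c in combinations(elems, r)
    )


def separation(a, phi):
    """ZF 3: the subset of a consisting of those x for which phi(x) holds."""
    return frozenset(x for x in a if phi(x))


def successor(x):
    """The operation x |-> x ∪ {x} appearing in ZF 8."""
    return x | frozenset({x})


def satisfies_foundation(x):
    """ZF 9 for a single set x: if x is nonempty, some y in x is disjoint from x."""
    return (not x) or any(not (y & x) for y in x)


empty = frozenset()          # ZF 2
one = successor(empty)       # {∅}
two = successor(one)         # {∅, {∅}}
```

Because `frozenset` equality compares members only, two such values are equal exactly when they have the same elements, which mirrors Extensionality (ZF 1).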
Dedekind’s theorem on the real continuum,
at 
countability, countable sets, 94-100 
countable axiom of choice (CAC), 77, 99-101 
countable chain condition, see CCC 
countable closed bounded sets 
classification of, 310 
cover, covering (of a set by a collection of 
sets), 281 
cumulative hierarchy of sets (V,), 386-387, 
392 


D 
Davis, M., 410 
DC, see axiom of dependent choice 
Dedekind complete orders, see complete orders 
Dedekind completion, 166-168 
Dedekind continuity, 154 
Dedekind cuts, 51-52, 154 
boundary cut, 52, 154 
gap, 51,52, 154 
$(\omega_\alpha, \omega_\beta^*)$, $(\omega_1, \omega_1^*)$, $(\omega, \omega^*)$ gaps,
214, 218-219, 414 
jump, 51,52, 154 
limit point cut, 154 
Dedekind finite, 85-86 
Dedekind infinite, 72, 85-86, 88—90, 100-101, 
374-375 
Dedekind partition, 154 
of ratios, 39 
Scott cut, 39n 
Dedekind, R., 27, 29, 31n, 42, 47-48, 51-54, 
57, 63-64, 67, 70-72, 85-89, 111, 
154, 166-168, 173, 398 
Dedekind—Peano axioms, 29-31, 67, 70, 383 
categoricity of, 41, 70-72 
model for, 87-88 
Dedekind—Peano systems, 70-72 
Dedekind’s theorem on, 41, 71 
dense 
order, 22, 152-153 
$\eta_1$-orderings, 218-219
dense orders vs dense subsets, 153 
relative density, 153 
sets of real numbers, 270-271
subset of posets, 249
subsets of orders, 153
dense-in-itself
$G_\delta$ sets, 293
orders, 170-172
sets of real numbers, 268-270
subsets of orders, 171
denumerable set, 95
derivative, derived set, see also Cantor–Bendixson derivative
in orders, 149-152
iterated, 151
of real sets, 267
D(A), 149-152, 267
descriptive set theory, 311
determinacy, 407-409
analytic, 409
Borel, 408-409
definable, 408
open and closed, 408
projective, 409-411
$\Diamond$ (Jensen’s Diamond Principle), see Diamond Principle
Diamond Principle ($\Diamond$), 166, 250, 402
discrete set of real numbers, 272 
domain, see relations, domain of 


E 
effective 
choice, 77 
choice set, 91 
definition, 91-93 
enumeration 
of N x N, 96 
of Q, 96 
equality of cardinals, 95 
equinumerosity and similarity, 95 
pairing functions, 98 
specification, 92 
effectiveness, 77, 90-93 
embedding 
continuous, of orders, 158 
order, 156 
continuous, 158 
empty 
set ($\emptyset$), 4-5
string or word ($\varepsilon$), 18
$\emptyset$, see empty set
$\varepsilon$, see empty string or word
endpoint, see orders (linear), endpoint 
enumeration, 95 
equiconsistent, 406 
equinumerosity, 77
effective, 95 
equivalence class, 19 
equivalence relations, 19-21 
and partitions, 20-21 
eventual containment, 270 
everywhere dense, see dense 
extensionality 
principle of, 3, 370, 421 


F 
$F_\sigma$ sets, 290-291
families, 13-15 
almost disjoint, 118-119, 225-226 
indexed, 13 
inductive, 83 
unindexed, 14 
Feferman, S., 93, 298, 405n 
field, ordered, see ordered field 
filter 
in posets, 249 
fineness property of the ratios, 39 
finite 
cardinals, 86-87 
Dedekind, see Dedekind finite 
induction, see induction, principle of 
(finite) 
ordinals, 382-383 
sequence, see sequences 
sets, 82-84 
Dedekind, see Dedekind finite 
first category sets, 292 
forcing 
method of, 166, 216, 354, 402-404 
poset, 402 
relation, 402 
formalism, 405 
fractions, 34-37 
Fraenkel, A., 367, 369, 376, 398 
Frege, G., 67, 69, 363-364, 397 
Frege—Russell—Scott invariant, 395 
Friedman, H., 408 
function builder notation, 11 
functionals, 375-376 
functions, 10-13 
bijection, 12 
Cantor ternary, 278 
characteristic, 114 
choice, 94 
composition of, 12 
continuous, see continuity, continuous 
maps 
extension, 11
homogeneous set for, 241 
image of 
forward, 11 
inverse, 11 
injective, 12 
notation 
function-builder, 11 
one-to-one, 12 
one-to-one correspondence, 12 
onto, 12 
pairing (effective), 98 
restriction, 11 
surjective, 12 
fundamental theorem of algebra, 65 


G 
$G_\delta$ sets, 290-291
continuum hypothesis for, 294 
dense-in-itself, 293 
Gödel incompleteness theorem, 416
Gödel’s Program, 404-405, 411
Gödel, K., 215n, 216, 354, 365, 396, 398, 399,
401-402, 404-406 
Gaifman, H., 406n 
Gale—Stewart theorem, 408 
Galileo, 85 
games 
Banach—Mazur, see Banach—Mazur game 
infinite, see infinite games 
generalized Cantor sets, see Cantor sets, 
generalized 
Generalized Continuum Hypothesis (GCH), 
see Continuum Hypothesis, 
Generalized 
greatest lower bound, 155, 256 


H 
Harrington, L., 409 
Hartogs’ 
cardinal, 203-205 
ordinal, 203-205 
set, 203-205, 377 
theorem, 203-205 
Hausdorff maximal principle, the, 224 
Hausdorff, F., 159, 195, 217, 218, 224, 229 
Heine—Borel 
condition, 283 
theorem, 281-285 
Hilbert, D., 53n, 72, 216n 
homeomorphic, homeomorphism of
order types, 301 
orders, 301-303 
sets of reals, 276-277 
subsets of R with orders and order types, 
302-303 
homogeneous set (for partitions, for functions), 
241 


I 
ideal, $\sigma$-ideal (of sets), 286-287
inclusion map, 156 
induction 
principle of (finite), 2, 179, 383 
principle of (over finite sets), 83 
transfinite, see transfinite induction 
inductive 
family, 83 
inductive set, 83 
infimum, 155 
infinitary combinatorics, 245 
infinite 
branch, see tree, infinite branch 
cardinals, 86 
Dedekind, see Dedekind infinite 
sequence, 95 
binary, 115 
sets, 83, 84 
Dedekind, see Dedekind infinite 
infinite games, 407-409 
inner models, 402n 
intermediate value theorem, 50, 173-174, 276 
as characterization of the continuum, 53, 
174 
failure of, 49-50 
intervals 
in orders, see orders (linear), intervals 
of real numbers, see real numbers and sets, 
intervals 
invariant, see complete invariant 
Frege—Russell—Scott, 395 
irrationals 
Dedekind’s definition of, 51
isomorphism 
finite partial, 160 
of orders, 135-136 


J 
Jensen’s Diamond Principle ($\Diamond$), 166, 250, 402
Jensen, R., 166, 250, 397 


K 
König’s
inequality, 125, 126
cofinality version, 213
Infinity Lemma, 237-238, 246
König, J., 126
Kechris, A. S., 409, 410 
Kelley, J. L., 396 
Kleene—Brouwer order, 147, 162, 326 
Kuratowski, K., 8n, 92, 345, 373 


L 
L, see constructible sets 
lambda-calculus, 366 
Landau, E., 38n, 57n, 59n, 64n 
large cardinals, see cardinal numbers, large 
cardinals 
least upper bound, 155, 256 
Least Upper Bound property, 155 
Lebesgue measurability 
of all sets of reals, 404, 408 
of analytic sets, 333-335 
of PCA ($\Sigma^1_2$) sets, 354-355, 402, 404, 406,
409 
of projective sets, 404, 409-410 
Lebesgue measurable sets, 287—290, 419 
non-measurable sets, 297, 299 
Lebesgue measure on R, 289-290, 417 
CCC property, 290 
existence, 419-420 
monotonicity, 289 
outer regularity, 289 
translation invariance, 289 
uniqueness, 289 
Lebesgue measure zero, 285-287 
Lebesgue, H., 285-290, 324, 419 
lengths (magnitudes), 54-58 
lexicographic 
see orders (linear), 218 
limit points (lower, upper) 
in orders, 149-152 
of order w, 151 
second and higher order, 151 
two-sided, 149 
of real sets, 266-267 
two-sided, 267 
Liouville 
constant, 107 
Liouville, J., 107 
logicism 
logicist program, 363-365 
long line, the, 201 
Lusin separation theorem, 331-333
Lusin’s problem, 352-355, 402
Lusin, N., 216, 332, 342, 352-355, 406 


M 
magnitudes (lengths), 54-58 
signed, 58 
Martin’s Axiom (MA), 249-250, 354 
Martin, D. A., 354, 389n, 407-410 
Mazur, S., 410 
Mazurkiewicz, S., 343 
meager sets, 292 
measurable 
cardinal, see cardinal numbers 
sets, see Lebesgue measurable sets 
measure problem, 299, 345-352 
measure zero, see Lebesgue measure zero 
measures 
$\kappa$-complete, 347
atomless, 347 
continuous, 347 
finite, 347 
non-trivial, 347 
probability, 347 
total, 347 
two-valued, 347 
monotone 
convergence property, the, 170 
real functions, 128 
sequences (increasing, decreasing), 170 
monotone order property, 245 
Morse, A. P., 396 
Morse-Kelley set theory (MK), 396 
Moschovakis, Y., 410 
Mostowski, A., 390 
Mycielski, J., 408, 410 


N 
natural numbers 
defined, 87 
nested interval property, the, 61, 256 
and complete orders, 168-170, 206, 214 
Cauchy, 61 
in R, 256 
sequential, 169, 206, 214 
strong, 169, 206, 214 
Neumann, J. von, 69-70, 72, 377-378, 381, 
385-386, 395, 398 
New Foundations, see NF set theory 
NF set theory (of Quine), 397 
stratified formula, 397 
non-measurable sets, 297, 299 
nowhere dense sets of real numbers, 272-275 
numbers
algebraic, 106 
cardinal, see cardinal numbers 
ordinal, see ordinal numbers 
transcendental, 106 


O
open sets, see real numbers and sets 
order types, 138-145 
$\omega$, $\zeta$, $\eta$, $\lambda$, 138
characterization of
order type $\eta$ of the rationals, 161
order type $\lambda$ of the reals, 165
defined as Frege—Russell—Scott invariant, 
395 
operations of, 138-145 
product, 143-145 
sum, 139-142 
reverse, 138 
symmetric, 138 


ordered 
n-tuple, 16 
field, 58-62
definition of, 60
of the real numbers, 58-61 
properties of, 62 
pair, 8, 373 
Kuratowski’s definition, 8n, 373 
orders (linear), 21-23, 131-133 
$\eta_1$-orderings, 218-219, 414
anti-lexicographic, 142 
bounded sets (below, above), 133 
bounds (lower, upper), 133 
greatest lower bound, 155 
infimum, 155 
least upper bound, 155 
supremum, 155 
CCC (countable chain condition), 163-164, 
247 
closed subsets, 172 
cofinal subset, 135 
coinitial subset, 135 
complete (Dedekind), see complete orders 
completion (Dedekind), 166-168 
continuity, continuous maps, 50, 157 
continuous (Dedekind continuity), 154 
continuous embedding, 158 
continuum, 154 
dense, 22, 152-153 
$\eta_1$-orderings, 218-219
dense orders vs dense subsets, 153 
relative density, 153 
subsets, 153 
dense-in-itself, 170-172
derived set, derivative, 149-152 
D(A), 149-152 
embedding, 156 
continuous, 158 
endpoint, 22, 133 
gaps, $(\omega_\alpha, \omega_\beta^*)$, $(\omega_1, \omega_1^*)$, $(\omega, \omega^*)$, 214,
218-219, 414 
intervals, open and closed, 135 
isomorphism, 135-136 
Kleene—Brouwer order, 147, 162, 326 
lexicographic, 142 
powers, 218 
limit points (lower, upper), 149-152 
of order w, 151 
second and higher order, 151 
two-sided, 149 
monotone order property, 245 
ordinal, see ordinal numbers 
perfect subsets, 172 
predecessors, 133 
immediate, 22, 133 
rearrangements, 136-138 
reverse, 137, 138 
segments, initial and final, 135 
separable, 164 
short, 226-228 
similar, similarity of, 135-136 
suborders, 134 
successors, 133 
immediate, 22, 133 
symmetric, 138 
types, see order types 
well-orders, see well-ordering 


orders (partial), see posets 
ordinal numbers (ordinals), 175-179
canonical order, 195-197
Cantor normal form, 198
club (closed unbounded) sets in $W(\omega_1)$, 202-203
cofinality, see cofinality
comparability theorem for, 185
countable ordinals, 193, 199-201
division algorithm, 191
epsilon numbers, 193
$\epsilon_0$, 193
even (and odd), 191
expansion in powers of a base, 197
exponentiation, 191-195
Hausdorff’s definition of, 195
finite, 382-383
Hartogs’, see Hartogs’ ordinal
initial ordinals, 204
$\omega_\alpha$, 205
initial set of, 186, 381
$W(\alpha)$, 184-186
least uncountable ordinal $\omega_1$, 200
$\omega_1$, 200
limit, 177, 381 
limit of a set of, 188 
normal functions on, 194 
odd (and even), 191 
operations defined by transfinite recursion, 
189-191 
ordering of (comparing), 183 
product (multiplication), 177, 187 
defined by transfinite recursion, 190 
product-closed, 194 
rank, rank function, see rank 
remainder ordinals, 191-195 
characterization of, 194 
second number class, 204 
subtraction, 190 
successor, 177, 381 
successor of, 187 
sum (addition), 177 
defined by transfinite recursion, 189 
sum-closed, 194 
supremum of a set of, 188 
transfinite induction, 179-181 
transfinite recursion over, 189, 381 
Von Neumann ordinals, 377-389 
comparability theorem for, 379 
definition of, 380-381 
existence, 380 
uniqueness, 379-380 
well-ordered sum of, 186-187 
outer measure, 419 


P 
pairing functions (effective), 98 
paradoxes, set-theoretic, 361-363 
Burali-Forti paradox, the, 361, 381 
Cantor’s paradox, 362 
impact on the logicist program, 363-364 
resolutions of, 364-367 
Russell’s paradox, 362-363 
partial orders, see posets 
partitions, 15—16 
and choice (axiom of), 90-93 
and equivalence relations, 20-21 
homogeneous set for, 241 
PCA sets ($\Sigma^1_2$ sets), 352
Baire property of, 354-355, 402, 404, 406, 
409 
Lebesgue measurability of, 354-355, 402, 
404, 406, 409 
perfect set property for, 354, 406, 409
regularity properties of, 352-355, 406, 
409 
Peano Arithmetic, 29 
Peano curves, 278-279 
Peano, G., 29 
perfect set property, 295 
for all sets of reals, 404, 408 
for analytic sets, 335-337 
for coanalytic sets, 353, 354, 402 
for PCA (Σ¹₂) sets, 354, 406, 409
for projective sets, 404, 409-410 
perfect sets, 303-305, 335-337 
cardinality of 
in R, 294 
in complete orders, 173 
in orders, 172 
of real numbers, 268-270, 274-275, 294 
property, see perfect set property 
platonism, 405 
pluralism, 405 
Polish spaces, 343 
posets (partial orders), 221-229 
antichain, 222 
bounded set (below, above), 222 
bounds (lower, upper), 222 
CCC (countable chain condition), 249 
chain, 222 
comparable and incomparable elements, 
222 
containing η₁ chains, 228
P(N) modulo finite sets, 228 
order of magnitude for positive 
sequences, 228 
orders of infinity for sequences, 228 
strict dominating order, 229 
dense subset, 249 
downward closed subset, 222 
embedding of, 223 
filters in, 249 
greatest and least element, 222 
initial part, 222 
isomorphisms of, 223 
maximal and minimal element, 222 
reflexive, 221 
representation theorem for, 223 
strict, 221 
strictly increasing maps on, 223 
power set, 6 
pre-well-ordering, 208 
primitive recursion, 42-45 
definition by, 44-45 
principle of definition by, 45 
Principia Mathematica (PM), 365-366 
principle of
abstraction, 20 
comprehension, naive, 3, 363-364 
definition by primitive recursion, 45 
extensionality, 3, 370, 421 


finite induction, see induction, principle of
(finite)
induction (finite), see induction, principle 
of (finite) 
recursive definition, 42-44 
transfinite induction, recursion, see 
transfinite induction, recursion 
projective determinacy, 409-411 
projective sets, 352-355 
Baire property of, 404, 409-410 
Lebesgue measurability of, 404, 409-410 
perfect set property for, 404, 409-410 


regularity properties of, 352-355, 409-410 


property of Baire, see Baire property 


Q 
Quine, W. V. O., 78, 365, 365n, 397 
quotient map, 20 


R 
Ramsey’s theorem, 241-243, 245-246 
general, 242 
Ramsey, F. P., 365 
range, see relations, range of 
rank (ordinal) 
Cantor-Bendixson (CB-rank), see
Cantor-Bendixson rank
for well-founded trees, 238 
of regular sets, 392-393 
on well-founded structures, 230-232 
of elements, 231 
of structure, 231 
rank function (ordinal) 
for abstract derivatives, 207 
for well-founded relations, 231 
canonical, 231 
rational numbers 
b-adic, dyadic, triadic, 262 
repeating infinite digit expansions of, 262 
ratios, 34-41 
Archimedean property of, 39 
Dedekind partition of, 39 
fineness property of, 39 
inadequacy of (in geometry and algebra), 
49-50 
integral, 37 
nonsquare, 40
square, 40 
density of, 40 


R, the set of all real numbers, see real numbers 
analytic, see analytic sets
Baire property, see Baire property 
Bernstein sets, see Bernstein sets 
Borel, see Borel sets 
bounded set, 256 
bounds (lower, upper), 255-256 
greatest lower bound, 256 
infimum, 256 
least upper bound, 256 
supremum, 256 
closed sets, 268-270 
closure, 268 
comeager set, 292 
compactness, 277 
condensation points, 273-274, 305 
continuity of a function at a point, 275 
continuous functions on, 275-276 
convergent sequence, 270 
countable closed bounded sets 
classification of, 310 
definition of real numbers and R, 58 
dense (everywhere dense) sets, 270-271 
dense-in-itself sets, 268-270 
derived set, derivative, 267 
D(A), 267 
discrete sets, 272 
everywhere dense sets, 270-271 
Fσ sets, 290-291
first category set, 292
Gδ sets, 290-291
homeomorphisms, homeomorphic sets, 
276-277 
intervals, 1, 101, 255 
bounded, 102 
closed, 255 
half-infinite, 102 
nested ternary sequence of, 261 
open, 255 
proper and improper, 102, 255 
subdivision trees of, 257 
ternary subdivisions of, 258 
isolated point, 267 
limit points (lower, upper), 266-267 
two-sided, 267 
meager set, 292 
measurable, see Lebesgue measurable sets 
measure zero, see Lebesgue measure zero 
nested intervals theorem, 256 
nowhere dense sets, 272-275
open sets, 265-266 
canonical decomposition into intervals, 
266 
countable base for, 265 
countable chain condition, 266 
PCA (Σ¹₂) sets, see PCA sets
perfect set property, see perfect set property 
perfect sets, 268-270, 274-275, 294, 
303-305, 335-337 
regularity properties, see regularity 
properties 
residual set, 292 
somewhere dense sets, 272 
strong measure zero, 287 
ternary expansions of, 261 
Vitali sets, see Vitali sets 
realism, 405 
recursive definition, 42-45 
basic principle of, 42 
principle of, 42-44 
reflection, 85 
reflexive 
cardinals, 89 
sets, 72, 85, 374-375 
regular sets, 391-393 
rank of, 392-393 
regularity properties 
of analytic sets, 337-338 
of PCA (Σ¹₂) and projective sets, 352-355
relationals, 375-376 
relations, 8-10 
antisymmetric, 9 
asymmetric, 9 
composition of, 9 
connected, 9 
domain of, 8 
equivalence, see equivalence relations 
inverse, 9 
irreflexive, 9 
product of (relative), 9 
properties of, 9 
range of, 8 
reflexive, 9 
symmetric, 9 
transitive, 9 
transitive closure of, 84, 232 
well-founded, see well-founded relations 
relativization, 399 
residual sets, 292 
reverse mathematics, 243 
Robinson, R. M., 396 
Rowbottom, F., 406n 
Russell set, 363 
Russell’s paradox, 362-363
Russell, B., 67, 69-70, 93, 113, 361-365, 367, 
397 


S
Schröder, E., 111
Schröder-Bernstein theorem, see Cantor-Bernstein theorem
Scott, D., 78, 394-395, 406n 
second number class, 204 
segments 
in orders, see orders (linear) 
of sequences, strings, see sequences, strings 
selector, 93 
separable orders, 164 
separating family, 347 
sequences, 16-19 
binary 
finite, 117 
infinite, 115 
Cauchy, 270 
concatenation of, 18 
convergent, 170, 270 
extension of, 18 
finite, 16 
infinite, 95 
limits of, 270 
monotone (increasing, decreasing), 170 
prefix (initial), 18 
segment (initial), 18 
uniqueness of limits of, 270 
set builder notation, 4 
set, sets 
(Boolean) algebra of, 7 
analytic, see analytic sets 
Bernstein, see Bernstein sets 
Borel, see Borel sets 
choice, 91 
comeager, 292 
constructible, see constructible sets 
countable, 94-100 
cumulative hierarchy of (V_α), 386-387
Dedekind finite, 85-86
Dedekind infinite, 72, 85-86, 88-90, 
100-101, 374-375 
denumerable, 95 
empty (∅), 4-5
equinumerous, 77 
effectively, 95 
Fσ, 290-291
finite, 82-84 
Dedekind, see Dedekind finite 
first category, 292 
Gδ, 290-291
Hartogs’, see Hartogs’ set 
ideal, σ-ideal (of sets), 286-287
inductive, 83 
infinite, 83, 84 
Dedekind, see Dedekind infinite 
meager, 292 
measurable, see Lebesgue measurable sets 
measure zero, see Lebesgue measure zero 
membership, 2 
notation 
brace-list, 5-6
set builder, 4 
of uniqueness, 311 
operations, 6-7 
power, 6 
reflexive, 72, 85, 374-375 
regular, 391-393 
rank of, 392-393 
residual, 292 
similar, similarity of, 77 
effective, 95 
singleton, 4-5, 78 
strong measure zero, 287 
successor of, X⁺, 379
transitive, see transitive sets 
Vitali, see Vitali sets 
well-founded, see regular sets 
set-theoretic paradoxes, see paradoxes, 
set-theoretic 
SH, see Suslin hypothesis 
Shelah, S., 404 
short linear orders, 226-228 
Sierpinski’s theorem, 318-319 
Sierpinski, W., 313, 318-319 
Σ¹₂ sets, see PCA sets
σ-ideal (sigma ideal) of sets, 286-287
sigma-algebra (σ-algebra), 321-322
CCC modulo a σ-ideal, 335
Silver indiscernibles (sharps), 409 
Silver, J., 355, 406n 
similarity 
of orders, 135-136 
of sets, 77 
effective, 95 
singleton, see set, singleton
Skolem, T., 367, 369, 371, 398 
Solovay, R. M., 216, 299, 352n, 354-355, 404, 
406-409 
space filling curves, 278-279 
Steel, J., 407, 410 
Steinhaus, H., 408 
string, 16-19
concatenation, 18 
empty (ε), 18
extension, 18 
prefix (initial), 18 
segment (initial), 18 
ternary strings, 259 

strong measure zero, 287 

structuralism, 67, 70-72 

successor of a set, X⁺, 379

supremum, 155 

Suslin 
Hypothesis, the (SH), 166, 249, 402 

independence of, 404 
line, 247-248 
operation A, 326-330 
Problem, the, 166, 247 
Suslin’s theorem, 333 
systems, 328 
tree, 248 
normal, 248 
Suslin, M. Y., 166, 324, 333 
Swierczkowski, S., 410 


T 
Tarski, A., 53n, 365 
theory of types, 364-366 
simple, 365 
topological properties, 277 
transfinite induction, see well-ordering, ordinal 
numbers, well-founded relations 
transfinite recursion, see well-ordering, ordinal 
numbers 
transitive sets, 382 
transitive closure of a set, 390-391 
tree, trees, 234-240, 324-326
Aronszajn, 246, 248 
binary, 117-119 
full, 236 
branch, 235 
finitely-branching, 235 
height 
of a tree, 235 
of an element ht_T(x), 235
infinite branch 
as digit string, 260 
as nested intervals, 261 
through finitely branching trees, see
König Infinity Lemma
through the binary tree, 118 
through trees, 259 
König Infinity Lemma, 237-238




levels of, Lev_α(T), 235

nodes, 234 

of strings over a set, 236 

over a set, 236 

representation theorems for, 236-237 

subtree, 235 

Suslin, 248 
normal, 248 

tree property of cardinals, 246 

well-founded, 238-240, 325-326, 339-342 
existence of (all ranks), 240 
ranks for, 238 
truncated ranks for, 239 

types 
order, see order types 
theory of, see theory of types 


U 

Ulam matrix, 333, 335 

Ulam, S., 299, 346, 349-351 

uniformization, 93 

uniqueness problem for trigonometric series, 
310 

universal sets, 343 

universe, set theoretic, 393 


V
V, the set theoretic universe, 375, 393 
V_α, the cumulative hierarchy of sets, 386-387,
392-394 
V=L, see axiom of constructibility
Vitali sets, 297-298, 335, 345-346 
Von Neumann 
ordinals, 377-389 
comparability theorem for, 379 
definition of, 380-381 
existence, 380 
uniqueness, 379-380 
well-order, 378-381 
comparability theorem for, 379 
existence, 380 
uniqueness, 379-380 
Von Neumann, J., see Neumann, J. von 
Von Neumann-Bernays set theory (VNB), 
395-396 


W
weakly compact cardinals, see cardinal 
numbers 
Weierstrass, K., 29
well-founded relations and structures, 
229-234 
canonical rank decomposition, 230 
extensional, 390 
Mostowski’s theorem, 390 
ordinal ranks, 230-233 
of elements, 231 
of structures, 231 
rank functions for, 231 
canonical, 231 
transfinite induction, 230, 233 
Zermelo set theory (Z), 387-389
Z, see Zermelo set theory
Zermelo-Fraenkel system, see ZF set theory
ZF set theory, 367, 369-385 
language of (formal), 369-370 
atomic formulas, 369 
free occurrence of a variable, 370